AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 
R. BRAUER, UNIVERSITY OF MICHIGAN
D. C. LEWIS, JR., THE JOHNS HOPKINS UNIVERSITY
L. M. GRAVES, UNIVERSITY OF CHICAGO
H. WHITNEY, HARVARD UNIVERSITY
A. WINTNER, THE JOHNS HOPKINS UNIVERSITY


WITH THE COOPERATION OF 


L. BERS, L. V. AHLFORS, C. B. ALLENDOERFER
J. L. DOOB, R. ARENS, R. BAER
E. K. HAVILAND, J. D. HILL, P. HARTMAN
N. JACOBSON, N. H. McCOY, J. J. STOKER
A. D. WALLACE, J. L. SYNGE, O. SZASZ


PUBLISHED UNDER THE JOINT AUSPICES OF 


THE JOHNS HOPKINS UNIVERSITY 
AND 


THE AMERICAN MATHEMATICAL SOCIETY 


VOLUME LXXI
1949 


THE JOHNS HOPKINS PRESS 
BALTIMORE 18, MARYLAND 
U. S. A. 




ON ORDERED GROUPS.* 
By B. H. NEUMANN. 


Certain axiomatic questions in geometry lead to the study of ordered 
division rings (cf. Hilbert [8]), and these in turn to the study of ordered 
groups: I only mention (without proof) that every ordered group can be 
embedded in (the multiplicative group of) an ordered division ring. 

F. W. Levi [9], [10] has given necessary conditions (not sufficient),
and also sufficient conditions (not necessary), for a group to be capable of 
being ordered. In the first section of this paper the necessary conditions are 
generalised, a typical result being: If two elements of an ordered group are 
not permutable with each other, then none of their powers are. A similar 
result is also proved for certain higher commutators. Just how far one can 
proceed in this direction remains an open question: Some light is thrown 
on it by an example. 

The sufficient conditions of Levi (loc. cit.) can also be generalised. In 
the second section we derive a very general sufficient criterion (which is also 
necessary, but only trivially so). An order is actually constructed in a group 
when the criterion applies; if the group possesses an ordered factor group, 
then its order can be utilised in the construction. 

The general criterion is an unwieldy weapon. It can be specialised in
various ways (§ 3). From one of these specialisations one sees that all free
groups can be ordered. (This result has also been obtained by G. Birkhoff
and, independently, by A. Tarski; cf. G. Birkhoff [3].) More generally we show
that the order of any ordered group can be refined to an order of a free
group of which the given group is a homomorphic image. Some special
constructions of new ordered groups from given ordered groups are also given.

In the fourth and final section these constructive methods are used to 
construct an ordered group which coincides with its commutator group. Such 
an example throws some light on the limitations of the various criteria and 
other results; it may also be of interest in itself, and is given in some detail. 


1. Necessary conditions for ordered groups. We call a group $G$ an
O-group if it can be fully ordered, i.e., if a transitive binary relation $a < b$
can be defined in $G$, such that of the three alternatives $a < b$, $a = b$, $b < a$

* Received February 13, 1947. 




one and only one takes place, and $a < b$ implies $at < bt$ and $ta < tb$ for all
$a$, $b$, $t$ in $G$. If $G$ is an O-group, and an order relation has been chosen for
$G$, we call $G$ simply an ordered group.¹ The group consisting of the unit
element only is an “improper” ordered group (with void order).

We write G multiplicatively, denote the unit element by 1, and the order 
relation by <, even when dealing with several groups simultaneously: 
different order relations will be distinguished by the context. We also use 
the commutator notation of P. Hall [7]; thus 

$[x, y] = x^{-1}y^{-1}xy, \qquad [x, y, z] = [[x, y], z],$
and so on, with the corresponding notation for subgroups.

We call a group “locally infinite” if every element $\neq 1$ in it is of
infinite order.² The group which consists of the unit element only is
“improperly” locally infinite.

Levi [9] shows that an O-group is locally infinite, and more generally
that the equation $x^m = a$ for $a \in G$, $m$ a natural number, has at most one
solution $x$. One can show more generally:


1.1 LEMMA. If $a$, $b$ are elements of an O-group $G$, and $[a^m, b] = 1$ for any
integer $m \neq 0$, then $[a, b] = 1$.


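Proof. Assume that $[a, b] \neq 1$, and that $[a, b] > 1$ in some order of $G$.
Now

$[a^m, b] = \prod_{\mu=m-1}^{0} \left(a^{-\mu}[a, b]a^{\mu}\right)$

and

$[a^{-m}, b] = \prod_{\mu=-m}^{-1} \left(a^{-\mu}[a, b]^{-1}a^{\mu}\right)$

for all $m > 0$, the factors being written in the order in which $\mu$ runs.
Hence $[a^m, b]$ is a product of conjugates of $[a, b]$; each of
these is $> 1$, and therefore $[a^m, b] > 1$. Also $[a^{-m}, b]$ is a product of
conjugates of $[a, b]^{-1}$; each of these is $< 1$, and therefore $[a^{-m}, b] < 1$.
Hence if $[a, b] \neq 1$, then $[a^m, b] \neq 1$ for every $m \neq 0$, and the lemma follows.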
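
The expansion used in this proof can be checked by hand in the smallest cases. The following worked computation is a sketch added for illustration, assuming the commutator convention $[x, y] = x^{-1}y^{-1}xy$ (the convention under which the expansion holds); it is not part of the original argument.

```latex
% Worked check of the commutator expansion.
% Assumption: [x, y] = x^{-1} y^{-1} x y.
% Positive exponent, m = 2:
\begin{align*}
a^{-1}[a,b]\,a \cdot [a,b]
  &= a^{-1}\bigl(a^{-1}b^{-1}ab\bigr)a \cdot a^{-1}b^{-1}ab \\
  &= a^{-2}b^{-1}a\,b\,b^{-1}a\,b \\
  &= a^{-2}b^{-1}a^{2}b \;=\; [a^{2},b].
\end{align*}
% Negative exponent, m = 1:
\begin{align*}
a\,[a,b]^{-1}a^{-1}
  &= a\bigl(b^{-1}a^{-1}ba\bigr)a^{-1}
   = a\,b^{-1}a^{-1}b \;=\; [a^{-1},b].
\end{align*}
```

In both cases the result is a product of conjugates of $[a, b]$, respectively of $[a, b]^{-1}$, which is all that the proof requires.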


1.2 COROLLARY. If two elements of an O-group are not permutable with
each other, then none of their powers ($\neq 1$) are permutable with
each other.
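
A standard illustration of how this corollary excludes groups from being O-groups, given here as a hedged example and not taken from the paper: the group $K = \{a, b;\ a^{-1}ba = b^{-1}\}$, the fundamental group of the Klein bottle. The name $K$ and the presentation are introduced only for this illustration. The group is locally infinite, yet $a^2$ is permutable with $b$ while $a$ is not.

```latex
% Assumption: K = < a, b | a^{-1} b a = b^{-1} >, with [x,y] = x^{-1}y^{-1}xy.
\begin{align*}
a^{-2}\,b\,a^{2} &= a^{-1}\bigl(a^{-1}ba\bigr)a = a^{-1}b^{-1}a
                  = \bigl(a^{-1}ba\bigr)^{-1} = b,
  &&\text{so } [a^{2},b] = 1; \\
[a,b] &= \bigl(a^{-1}b^{-1}a\bigr)\,b = b \cdot b = b^{2} \neq 1 .
\end{align*}
% The powers a^2 and b are permutable although a and b are not,
% so by the corollary K cannot be fully ordered.
```

The example shows that local infiniteness alone is not sufficient for a group to be an O-group.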


¹ This is, of course, a special case of the o-groups of Everett and Ulam [4] and
l-groups of G. Birkhoff [2].

² A group has a property locally if all its subgroups of finite rank (generated by a
finite number of elements) have the property.




We can further extend 1.1 by establishing the following necessary
condition for O-groups.


1.3 LEMMA. If $a$, $b$ are elements of an O-group $G$, and $[a^m, b, a] = 1$ for
any integer $m \neq 0$, then $[a, b, a] = 1$.
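Proof. Let $m$ again be a positive integer. Then we expand

$[a^m, b, a] = \left[\prod_{\mu=m-1}^{0} \left(a^{-\mu}[a, b]a^{\mu}\right),\ a\right] = \prod_{\mu=m-1}^{0} \left(t_\mu^{-1}\left[a^{-\mu}[a, b]a^{\mu},\ a\right]t_\mu\right),$

where $t_\mu$ are certain part products of $\prod_{\mu=m-1}^{0} \left(a^{-\mu}[a, b]a^{\mu}\right)$. Then

$[a^m, b, a] = \prod_{\mu=m-1}^{0} \left((a^{\mu}t_\mu)^{-1}[a, b, a](a^{\mu}t_\mu)\right);$

hence again $[a^m, b, a]$ is a product of conjugates of $[a, b, a]$, and therefore is
$\gtrless 1$ if $[a, b, a] \gtrless 1$. Similarly

$[a^{-m}, b, a] = \left[\prod_{\mu=-m}^{-1} \left(a^{-\mu}[a, b]^{-1}a^{\mu}\right),\ a\right]$

$= \prod_{\mu=-m}^{-1} \left(t'^{-1}_\mu\left[a^{-\mu}[a, b]^{-1}a^{\mu},\ a\right]t'_\mu\right)$

$= \prod_{\mu=-m}^{-1} \left((a^{\mu}t'_\mu)^{-1}\left[[a, b]^{-1}, a\right](a^{\mu}t'_\mu)\right)$

$= \prod_{\mu=-m}^{-1} \left((a^{\mu}t'_\mu)^{-1}[a, b][a, b, a]^{-1}[a, b]^{-1}(a^{\mu}t'_\mu)\right).$

Thus $[a^{-m}, b, a]$ is a product of conjugates of $[a, b, a]^{-1}$ and therefore is $\lessgtr 1$
if $[a, b, a] \gtrless 1$. The lemma follows.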


1.4 COROLLARY. If $a$, $b$ are elements of an O-group $G$ and their commutator
$[a, b]$ is not permutable with $a$, then no commutator $[a^m, b]$ of a
power ($\neq 1$) of $a$ and $b$ is permutable with any power ($\neq 1$) of $a$.
Or: if $[a, b, a] \neq 1$, then $[a^m, b, a^n] \neq 1$ for any $m \neq 0$, $n \neq 0$.


This follows by applying Lemmas 1.1 and 1.3.
One will naturally look for further generalisations of these necessary
conditions for O-groups. Two directions suggest themselves: One, to decide
whether $[a^m, b, a, a] = 1$ ($m \neq 0$) entails $[a, b, a, a] = 1$; the other, to
decide whether $[a^m, b, c] = 1$ ($m \neq 0$) entails $[a, b, c] = 1$. The first of
these questions I can not answer; the second is answered, negatively, by the
following example.




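1.5 Example. Let the group $H$ be generated by elements

$b_\mu,\ c_\mu, \qquad \mu = 0, \pm 1, \pm 2, \cdots,$

with the defining relations

1.51   $[b_{\mu+1}, b_\mu] = c_\mu$,   $\mu = 0, \pm 1, \pm 2, \cdots$;
1.52   $[b_{\mu+\nu}, b_\mu] = 1$,   $\mu = 0, \pm 1, \pm 2, \cdots$; $\nu = 2, 3, \cdots$;
1.53   $[b_\mu, c_\nu] = 1$,   $\mu, \nu = 0, \pm 1, \pm 2, \cdots$;
1.54   $[c_\mu, c_\nu] = 1$,   $\mu, \nu = 0, \pm 1, \pm 2, \cdots$.


We define an order in $H$ such that

$\cdots \ll c_{-1} \ll c_0 \ll c_1 \ll \cdots \ll b_{-1} \ll b_0 \ll b_1 \ll \cdots,$

where $x \ll y$ means that all powers of $x$ lie between $y^{-1}$ and $y$. Thus any
element of $H$ is $> 1$ if the highest-suffix $b_\mu$ in it appears with positive
exponent, or if there is no $b_\mu$ in it, and the highest-suffix $c_\nu$ in it appears
with positive exponent. One can satisfy oneself without difficulty that in
this way $H$ does become an ordered group. The mapping

$b_\mu \to b_{\mu+1}, \qquad c_\mu \to c_{\mu+1}$

clearly defines an automorphism of $H$ qua group, and this automorphism
leaves the order of $H$ invariant. We now define $G$ by adjoining this
automorphism to $H$, i.e., we form $G = \{H, a\}$ with the relations

$a^{-1}b_\mu a = b_{\mu+1}, \qquad a^{-1}c_\mu a = c_{\mu+1},$

and we order $G$ by making $a > 1$ and $a \gg h$ for all $h \in H$. It is again easy
to see that $G$ thus becomes an ordered group.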
Now in $G$

$[a, b_0, b_0] = [[a, b_0], b_0] = [b_1^{-1}b_0, b_0] = [b_1^{-1}, b_0] = c_0^{-1} < 1,$

but when $m > 1$

$[a^m, b_0, b_0] = [[a^m, b_0], b_0] = [b_m^{-1}b_0, b_0] = 1.$
This example, therefore, proves 


1.6 LEMMA. In an O-group, $[a, b, b] \neq 1$ is compatible with $[a^m, b, b] = 1$
for $m > 1$.


and thus a fortiori 




1.7 COROLLARY. In an O-group, $[a, b, c] \neq 1$ is compatible with $[a^m, b, c] = 1$
for $m > 1$.


2. Sufficient conditions for ordered groups. We use the following
criterion for O-groups, adapted from one given by Levi [9].


2.1 LEMMA. The group $G$ is an O-group if (and only if) it contains two
subsets $s^+$ and $s^-$ such that

2.11   $s^+ \cup s^- = G - \{1\}$,

i.e., every element of $G$, except the unit element, lies in $s^+$ or in $s^-$;

2.12   $s^+s^+ \subseteq s^+$,   $s^-s^- \subseteq s^-$,

i.e., $s^+$ and $s^-$ are semi-groups;

2.13   $t^{-1}s^+t \subseteq s^+$ for all $t \in G$,

i.e., $s^+$ (and therefore also $s^-$) is self-conjugate in $G$.


Proof. Let $G$ possess two such subsets. Then, as they are semi-groups
but do not contain the unit element, neither of them contains an element
simultaneously with its inverse. But between them they contain all elements
of $G$ except 1. Hence of a pair of inverse elements one always belongs to $s^+$
and the other to $s^-$. Now we define an order relation in $G$ by

2.14   $a < b$ if and only if $a^{-1}b \in s^+$.

Then if $a < b$, $b^{-1}a \in s^-$, and therefore $b \nless a$; also $b \neq a$, as the unit element
is not in $s^+$. Hence of the three alternatives $a < b$, $a = b$, $b < a$, not more
than one takes place. But one of them does take place, as of the two inverse
elements $a^{-1}b$, $b^{-1}a$ one lies in $s^+$, unless $a^{-1}b = 1$. Also if $a < b$ and $b < c$,
then $a^{-1}b \in s^+$, $b^{-1}c \in s^+$; hence $a^{-1}c \in s^+$, by 2.12, and $a < c$; which shows
transitivity of the order relation. Finally if $a^{-1}b \in s^+$ then also $(at)^{-1}bt \in s^+$ because
of 2.13, and $(ta)^{-1}tb \in s^+$ trivially; i.e. if $a < b$ then $at < bt$ and $ta < tb$.
Hence $G$ is ordered.

The converse is also true; for if $G$ is an O-group, we choose an order of $G$
and then denote by $s^+$ the set of all elements $> 1$, by $s^-$ the set of all elements
$< 1$. Then 2.11-13 are easily checked.

We shall also use the following sufficient criterion, due to Levi [9] 
(q.v. for a proof) : 


2.2 THEOREM. A locally infinite abelian group is an O-group.
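
As a small illustration of the criterion 2.1 in the simplest abelian case (added here, not taken from the paper), consider the free abelian group of rank two, written multiplicatively. The generator names $x$, $y$ and the choice of the lexicographic order are assumptions made for the example. The two sets below satisfy 2.11-2.13, and 2.14 then yields the usual lexicographic order.

```latex
% Assumption: G = { x^m y^n : m, n integers } is free abelian on x, y.
\begin{align*}
s^{+} &= \{\, x^{m}y^{n} : m > 0, \ \text{or } m = 0 \text{ and } n > 0 \,\}, \\
s^{-} &= \{\, x^{m}y^{n} : m < 0, \ \text{or } m = 0 \text{ and } n < 0 \,\}.
\end{align*}
% 2.11: every x^m y^n other than 1 has (m,n) \neq (0,0), so it lies in s^+ or s^-.
% 2.12: if (m,n) and (m',n') are lexicographically positive, so is (m+m', n+n');
%       likewise for negative pairs.
% 2.13: G is abelian, so t^{-1} s^+ t = s^+ for every t.
% The order defined by 2.14 then reads:
\[
x^{m}y^{n} < x^{m'}y^{n'}
\iff m < m' \ \text{or} \ (m = m' \text{ and } n < n').
\]
```

Interchanging the roles of $x$ and $y$ gives a different order on the same group, so the pair $s^+$, $s^-$ is in general far from unique.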


The most general sufficient criterion that we derive is the following. 




2.3 THEOREM. Let the group $G$ possess a set of subgroups linearly ordered
by inclusion:

2.30   $\cdots \supseteq H_\alpha \supseteq H_{\alpha'} \supseteq \cdots \supseteq \{1\}$

(not necessarily all different) with the following properties:

2.31 Each $H_\alpha$ is self-conjugate in $G$, i.e., 2.30 is a generalised normal
series of $G$.
2.32 Each $H_\alpha$ except $\{1\}$ has an immediate successor $H_{\alpha'}$ in the series 2.30.³
2.33 For all terms of the series,
$[G, H_\alpha] \subseteq H_{\alpha'}$,
i.e., 2.30 is a generalised central series of any one of its terms.
2.34 If 2.30 has a first term $H_1$, then $G/H_1$ is an O-group.
2.35 $H_\alpha/H_{\alpha'}$ is (properly or improperly) locally infinite.
2.36 To every element $g \in H_1$ if 2.30 has a first term $H_1$, or to every
element $g \in G$ if 2.30 has no first term, there is a minimal $H_\alpha$
containing it, i.e., $g = 1$ or there is an $H_\alpha$ with $g \in H_\alpha - H_{\alpha'}$.

Then $G$ is an O-group.

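Proof. By 2.33 and 2.35 each $H_\alpha/H_{\alpha'}$ is abelian and locally infinite,
hence, by 2.2, an O-group. In each $H_\alpha/H_{\alpha'}$ we choose⁴ an order. We denote
by $s^+_\alpha H_{\alpha'}$ ($s^-_\alpha H_{\alpha'}$) the set of all elements of $H_\alpha$ which are greater (smaller)
than the unit element (mod $H_{\alpha'}$) in this order. If 2.30 possesses a first
term $H_1$, we also choose an order of $G/H_1$, and define $s^+_0H_1$ ($s^-_0H_1$)
accordingly. Finally we introduce the union $s^+$ ($s^-$) of all $s^+_\alpha H_{\alpha'}$ ($s^-_\alpha H_{\alpha'}$):

2.41   $s^+ = \bigcup_\alpha s^+_\alpha H_{\alpha'}$,   $s^- = \bigcup_\alpha s^-_\alpha H_{\alpha'}$.

Now let $g \neq 1$ be an element of $G$; then (by 2.36) either there is an $H_\alpha$
such that $g \in H_\alpha - H_{\alpha'}$, or 2.30 has a first term $H_1$ and $g \in G - H_1$. Hence
$g$ is greater or smaller than the unit element (mod $H_{\alpha'}$ or mod $H_1$) in $H_\alpha$
or $G$: hence $g \in s^+_\alpha H_{\alpha'} \cup s^-_\alpha H_{\alpha'}$ (or $s^+_0H_1 \cup s^-_0H_1$). In any case $g \in s^+ \cup s^-$,
and

2.42   $s^+ \cup s^- = G - \{1\}$.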


³ Note that the series 2.30 need not be well-ordered; its order type is finite or made
up, additively, from order types $\omega$ and $^*\omega + \omega$, with a finite tail. (For the notation
cf. Fraenkel [5].)

⁴ The axiom of choice is used. 2.2 requires well-order for its proof.




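Let now $g$ and $h$ be two elements in $s^+$. Let $g \in H_\alpha - H_{\alpha'}$, $h \in H_\beta - H_{\beta'}$, and
$H_\beta \subseteq H_\alpha$, say. Now if $H_\beta \neq H_\alpha$, i.e., $H_\beta \subseteq H_{\alpha'}$, then $gh$ is congruent to $g$
(mod $H_{\alpha'}$); then $gh \in s^+_\alpha H_{\alpha'}$ and $gh \in s^+$. If $H_\beta = H_\alpha$, then $g$ and $h$ are
both in the same $s^+_\alpha H_{\alpha'}$, hence their product is. Hence again $gh \in s^+$.
Correspondingly if $g$ or $h$ lies outside $H_1$. In any case

2.43   $s^+s^+ \subseteq s^+$.

Similarly one proves

$s^-s^- \subseteq s^-$.

Finally let $g \in s^+$, let us say $g \in s^+_\alpha H_{\alpha'}$; and $t \in G$ arbitrary. Then⁵

$[g, t] \in [H_\alpha, G] \subseteq H_{\alpha'}$.

Thus

$t^{-1}gt = g[g, t] \in gH_{\alpha'}$,

and so

$t^{-1}gt \in s^+_\alpha H_{\alpha'}$.

If, on the other hand, 2.30 has a first term and $g \in s^+_0H_1$ then also

$t^{-1}gt \in s^+_0H_1$;

for $s^+_0$ is self-conjugate in $G/H_1$, and $H_1$ is self-conjugate in $G$; hence $s^+_0H_1$
is self-conjugate in $G$. Thus we see that

2.44   $t^{-1}s^+t \subseteq s^+$ for all $t \in G$.

Combining 2.42-2.44 and 2.1, we see that $G$ is an O-group and the proof
of the theorem is complete.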
If $G$ is an ordered group, $H$ a self-conjugate subgroup of $G$, and
$K = G/H$ is ordered so that $aH \leq bH$ in the order of $K$ whenever $a < b$
in the order of $G$, then we call the order of $G$ a “refinement” of the order
of $K$. Then the proof of Theorem 2.3 also shows:


2.5 COROLLARY. If under the conditions of the criterion 2.3 the series
2.30 has a first term $H_1$, then any order of $G/H_1$ can be refined
to an order of $G$.


⁵ This is the only step in the proof which fully uses 2.33. One easily sees that 2.33
and 2.35 can be replaced by the following condition (which is not, however, equivalent
to 2.33, 2.35):

2.33' Each $H_\alpha/H_{\alpha'}$ is an O-group, and in particular possesses an order which
admits all inner automorphisms of $G$.

Then for the purposes of the proof such an order has to be chosen in each $H_\alpha/H_{\alpha'}$.
This modification of the theorem generalizes another criterion due to Levi [9].




3. Special methods for constructing ordered groups. Theorem 2.3 is
somewhat unwieldy. Some special cases may, however, be of interest.
Let $G$ be a group and denote by $^nG$ the terms of its lower central series:

$^0G = G$,   $^{n+1}G = [G, {}^nG]$.

Further denote by $Z_n(G)$ the terms of its upper central series:

$Z_0(G) = \{1\}$,   $Z_{n+1}(G)/Z_n(G) =$ the centre of $G/Z_n(G)$.

3.1 THEOREM. If $G$ is such that⁶

3.11   $\bigcap_n {}^nG = \{1\}$;
3.12   $^nG/^{n+1}G$ is locally infinite for $n = 0, 1, 2, \cdots$;

then $G$ is an O-group.

3.2 THEOREM. If $G$ is such that⁷

3.21   $\bigcup_n Z_n(G) = G$;
3.22   $Z_{n+1}(G)/Z_n(G)$ is locally infinite for $n = 0, 1, 2, \cdots$;

then $G$ is an O-group.

Both these theorems are easy consequences of 2.3. The set of groups
$H_\alpha$, consisting of the terms of the lower or upper central series, is here finite
or of order type $\omega$ (3.1) or $^*\omega$ (3.2), so that 2.32 is satisfied. The definition
of the central series assures 2.31, 2.33. Assumptions 3.11 and 3.21 entail
2.36; 3.12 and 3.22 are simply 2.35. 2.34 is also satisfied, as a consequence
of 2.2. Hence 2.3 applies.
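
To see Theorem 3.1 at work on a concrete case, here is a hedged worked example that is not in the original: the group $N = \{x, y, z;\ [x, y] = z,\ [x, z] = [y, z] = 1\}$, the integral Heisenberg group. The name $N$ and this presentation are chosen for illustration only.

```latex
% Assumption: N = < x, y, z | [x,y] = z, [x,z] = [y,z] = 1 >.
% Every element has a unique normal form x^i y^j z^k with integers i, j, k.
\begin{align*}
{}^{0}N &= N, \\
{}^{1}N &= [N, {}^{0}N] = \{ z^{k} \} \cong \mathbb{Z}, \\
{}^{2}N &= [N, {}^{1}N] = \{1\}
  && (\text{$z$ is central}), \\[4pt]
{}^{0}N/{}^{1}N &\cong \mathbb{Z}^{2}
  && (\text{generated by the images of $x$, $y$}), \\
{}^{1}N/{}^{2}N &\cong \mathbb{Z}.
\end{align*}
% 3.11 holds because the series reaches {1}.
% 3.12 holds because Z^2 and Z have no elements of finite order other than
% the unit; the later factors are trivial, i.e. improperly locally infinite.
% Hence N is an O-group, although it is not abelian.
```

The same reasoning applies to any group whose lower central series reaches $\{1\}$ after finitely many steps with all factors free of elements of finite order.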


3.3 COROLLARY. (Cf. Birkhoff [3].) All free groups (of finite or infinite
rank) are O-groups.


3.4 THEOREM. If the ordered group $G$ is represented as a factor group
of a free group $F$ with respect to a relation group $R$,
3.41   $G = F/R$,
then the order of $G$ can be refined to an order of $F$.


Proof. We use the intersection of $R$ with the terms $^nF$ of the lower
central series of $F$, putting
3.42   $R_n = R \cap {}^nF$,   $n = 0, 1, 2, \cdots$.


Now as $R$ is self-conjugate in $F$, $R_n$ is also self-conjugate in $F$, and


⁶ 3.11 means that $G$ is an N-group in the terminology of Baer [1].
⁷ 3.21 means that $G$ is a Z-group in the terminology of Baer [1].




3.43   $[F, R_n] \subseteq R_n \subseteq R$.
Also
$[F, R_n] \subseteq [F, {}^nF] = {}^{n+1}F$.
Hence
3.44   $[F, R_n] \subseteq R_{n+1}$.


Let $a \in F$ have a (proper) power in $R_{n+1}$:
$a^k \in R_{n+1}$,   $k \neq 0$.
Then $a^k \in {}^{n+1}F$; but as $F/{}^{n+1}F$ is locally infinite, $a \in {}^{n+1}F$. Also $a^k \in R$; but
as $F/R$ is an O-group and therefore locally infinite, also $a \in R$. Hence
$a \in R_{n+1}$. This means that $F/R_{n+1}$ is locally infinite, and therefore a fortiori
$R_n/R_{n+1}$ is locally infinite.
Finally
3.45   $\bigcap_n R_n = \{1\}$.


Hence every element $r \in R$, $r \neq 1$ is in a smallest $R_n$.
The theorem follows now simply from Theorem 2.3 and Corollary 2.5.


We now give some fairly obvious results which can be used for con- 
structing new ordered groups from given ordered groups. Detailed proofs 
are omitted. 


3.5 The (complete⁸) direct product of any set of O-groups is an O-group.
For we can well-order the direct factors, and choose an order in each.
The direct product is then simply ordered by the convention that an element
is $\gtrless 1$ according as the first component $\neq 1$⁹ (in the well-order of the direct
factors) of the element is $\gtrless 1$ in the chosen order of the factor.
The following construction dispenses with well-order.


3.6 Given an ordered set of O-groups, their restricted¹⁰ direct product
can be so ordered that the direct factors appear in the given set order.


For we can choose an order in each direct factor. The restricted direct


⁸ I.e. without restriction upon the number (or cardinal) of components $\neq 1$ of an
element.

⁹ 1 stands for the unit element of all the groups that occur; similarly $\gtrless$ applies
to the order chosen in the factors as well as to that under construction in the product.

¹⁰ I.e., that in which every element has only a finite number of components $\neq 1$.




product is then simply ordered by the convention that an element is $\gtrless 1$
according as the last component $\neq 1$ (in the order of the set of factors)
of the element is $\gtrless 1$ in the chosen order of the factor.


3.5 and 3.6 are in fact only special cases of known results. Cf. Hahn [6].


As an application of 3.6 we give, in some detail, the following construc- 
tion, which will be used in the next section. 


3.7 Starting from an ordered group $B$ we form the restricted direct product
of an ordered set of type $^*\omega + \omega$ of factors $B_n$, $n = 0, \pm 1, \pm 2, \cdots$, each
isomorphic to $B$:

$B^* = \cdots \times B_{-1} \times B_0 \times B_1 \times B_2 \times \cdots.$

The elements are of the form

3.71   $b^* = \cdots \times b_{-1,-1} \times b_{0,0} \times b_{1,1} \times b_{2,2} \times \cdots$

(where the first suffix distinguishes the direct factor in which the component
lies, the second suffix the element of $B$ which appears as this component);
only a finite number of the components $b_{n,n}$ are different from the unit element,
and the last one of these determines whether $b^* \gtrless 1$. Now $B^*$ possesses an
obvious automorphism (relating to its order as well as to the group operation),
viz. that mapping each component $B_n$ on its successor $B_{n+1}$. We denote this
automorphism by $a$ and extend $B^*$ by means of it; i.e. we form all the
products

3.72   $g = a^{\alpha}b^*$

with the transformation rule

3.73   $a^{-1}b^*a = \cdots \times b_{-1,-2} \times b_{0,-1} \times b_{1,0} \times b_{2,1} \times \cdots$

(where $b^*$ is given by 3.71). The elements 3.72 form a group $G$, which
we order first according to the power of $a$ in it, then according to $b^*$. Thus

3.74   $g \gtrless 1$ if $\alpha \gtrless 0$, or if $\alpha = 0$ and $b^* \gtrless 1$.
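
A hedged worked instance of this construction, added for illustration: take $B$ to be an infinite cycle generated by an element $b > 1$ (an assumption of the example), and write $b_{(n)}$ for the copy of $b$ in the factor $B_n$ (a notation introduced here, not used in the paper). Then $G$ is the restricted wreath product of two infinite cycles, and the rule 3.74 can be read off directly.

```latex
% Assumptions: B = {b} is infinite cyclic with b > 1;
% b_{(n)} denotes the copy of b lying in the direct factor B_n.
\begin{align*}
g_1 &= b_{(0)}^{5}\, b_{(2)}^{-1}
  && \alpha = 0,\ \text{last component } \neq 1 \text{ is } b_{(2)}^{-1} < 1
  && \Longrightarrow\ g_1 < 1, \\
g_2 &= b_{(-3)}^{-7}\, b_{(1)}
  && \alpha = 0,\ \text{last component } \neq 1 \text{ is } b_{(1)} > 1
  && \Longrightarrow\ g_2 > 1, \\
g_3 &= a\, b_{(4)}^{-100}
  && \alpha = 1 > 0
  && \Longrightarrow\ g_3 > 1 .
\end{align*}
% The shift rule 3.73 moves every component one place up:
\[
a^{-1}\, b_{(n)}\, a = b_{(n+1)} ,
\]
% so conjugation by a does not change which non-unit component comes last
% relative to the others, and hence preserves the sign of b^*.
```

The third line shows that the exponent of $a$ dominates: no element of $B^*$, however small, can outweigh a positive power of $a$.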

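If $B$ is given by a system of generators $b, b', \cdots$ and defining relations
$r(b, b', \cdots) = r'(b, b', \cdots) = \cdots = 1$, then we can give $G$ in the same
manner:

3.75   $G = \{a, b, b', \cdots;\ r(b, b', \cdots) = r'(b, b', \cdots) = \cdots = 1,\ [a^{-\lambda}ba^{\lambda}, a^{-\mu}ba^{\mu}] = [a^{-\lambda}ba^{\lambda}, a^{-\mu}b'a^{\mu}] = \cdots = 1\ (\lambda \neq \mu)\}$.


The commutator relations are formed for all pairs of generators $b, b', \cdots$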




of $B$, but may then be restricted to $\lambda = 0$, $\mu > 0$. If $G$ is defined by 3.75,
the order in $G$ can be described without reference to $B^*$, solely in terms of
$a$, $B$. To this end we represent an element of $G$ in the form

3.76   $g = a^{\alpha}\prod_{i=1}^{m} a^{-\mu_i}b_ia^{\mu_i}$,

where $\mu_1 > \mu_2 > \cdots > \mu_m$, all $b_i \neq 1$.¹¹ This is possible because transforms
of elements of $B$ by different powers of $a$ are permutable with each other;
$\alpha$ is the sum of exponents with which $a$ appears in $g$.¹² Now the order is
defined in $G$ by

3.77   $g \gtrless 1$ if $\alpha \gtrless 0$ or $\alpha = 0$ and $b_1 \gtrless 1$.


This construction can be extended by replacing the powers of $a$ by the
elements of an arbitrary ordered group $A$.
3.8 Let the ordered groups $A$ and $B$ be given by

3.81   $A = \{a, a', \cdots;\ q(a, a', \cdots) = q'(a, a', \cdots) = \cdots = 1\}$,
3.82   $B = \{b, b', \cdots;\ r(b, b', \cdots) = r'(b, b', \cdots) = \cdots = 1\}$.

Then we form the group

3.83   $G = \{a, a', \cdots, b, b', \cdots$; 3.84-.86$\}$

where the relations are

3.84   $q(a, a', \cdots) = q'(a, a', \cdots) = \cdots = 1$,
3.85   $r(b, b', \cdots) = r'(b, b', \cdots) = \cdots = 1$,
3.86   $[b, a_\lambda^{-1}ba_\lambda] = [b, a_\lambda^{-1}b'a_\lambda] = \cdots = 1$.

In the commutator relations 3.86 the $a_\lambda$ range over all elements $> 1$
of $A$, and the elements of $B$ involved range over all pairs of generators
$b, b', \cdots$. An element $g \in G$ can be represented in the form

3.87   $g = a\prod_{\mu=1}^{m}(a_\mu^{-1}b_\mu a_\mu)$

with $a \in A$, $a_1 > a_2 > \cdots > a_m$ in $A$, and with all $b_\mu \neq 1$. This is possible because
transforms of elements of $B$ by different elements of $A$ are permutable with
each other. Then $G$ is ordered by the convention


¹¹ If $m = 0$, the product is void.
¹² Easily seen to be an invariant as long as $a$ and elements of $B$ are chosen as the
generators of $G$.




3.88   $g \gtrless 1$ if $a \gtrless 1$ or $a = 1$ and $b_1 \gtrless 1$.
The proof that in this way $G$ becomes an ordered group is omitted.


Finally we mention, without proof, a construction principle due to
Steinitz [11].
3.9 Let a system $\mathfrak{S}$ of ordered groups $G_\alpha$ be given with the property that
to any two groups $G_\alpha$, $G_\beta$ in $\mathfrak{S}$ there is a group $G_\gamma$ in $\mathfrak{S}$ which contains both
$G_\alpha$ and $G_\beta$ as subgroups¹³ and continues the order of both. Then there is
an ordered group $G$ containing as subgroups all the groups in $\mathfrak{S}$, each with
its order, and generated by them.


4. A perfect ordered group. Levi [10] shows that if an O-group $G$
is finitely generated then it is different from its commutator group $G'$. To
show that the finiteness of the number of generators can not be dispensed
with; to illustrate the limitations to our various sufficient criteria for O-groups
(2.3, 3.1, 3.2); and to demonstrate the application of our various constructive
principles: we now construct an ordered group which is perfect,¹⁴ i.e.
coincides with its commutator group.

Starting from an infinite cycle

4.01   $H_1 = \{b_1\}$

we first define a series $H_n$ of groups by repeated application of 3.7.

4.02   $H_n = \{b_1, b_2, \cdots, b_n;\ [b_r, b_q^{-\lambda}b_sb_q^{\lambda}] = 1\ (q > r, s;\ \lambda \neq 0$¹⁵$)\}$.

Clearly $H_n$ is obtained from $H_{n-1}$ by adding the generator $b_n$ and relations
which entail that two elements of $H_{n-1}$, transformed by different powers of $b_n$,
are permutable. The method of 3.7 can also be used to order $H_n$, when it
can be seen that the order of $H_n$ continues that of $H_{n-1}$; but we do not
require the order at this stage.

We now consider the elements

4.03   $c_\nu = [b_\nu, b_n] = b_\nu^{-1}b_n^{-1}b_\nu b_n$,   $\nu = 1, 2, \cdots, n-1$,

and denote by $K_{n-1}$ the subgroup of $H_n$ generated by these elements. We
proceed to show that

4.04   $K_{n-1} \cong H_{n-1}$;

more particularly, the mapping


¹³ A group $G_\gamma$ may contain different subgroups isomorphic and similarly ordered to
$G_\alpha$; but $G_\alpha$ must be one of the subgroups of $G_\gamma$.

¹⁴ We use “perfect” in its group-theoretical sense.

¹⁵ Here as later $\lambda$ may be restricted to positive values.




$c_\nu \to b_\nu$,   $\nu = 1, 2, \cdots, n-1$,

defines an isomorphism between $K_{n-1}$ and $H_{n-1}$. To see this we form any
word $w(b_1, b_2, \cdots, b_{n-1})$ in the generators of $H_{n-1}$. Then

4.05   $w(c_1, c_2, \cdots, c_{n-1}) = w(b_1^{-1}, b_2^{-1}, \cdots, b_{n-1}^{-1}) \cdot w(b_n^{-1}b_1b_n, b_n^{-1}b_2b_n, \cdots, b_n^{-1}b_{n-1}b_n)$;

for the expressions in $b_\nu^{-1}$ permute with those in $b_n^{-1}b_\nu b_n$. The two factors
on the right-hand side of 4.05 lie in different components (viz. $H_{n-1}$ and
$b_n^{-1}H_{n-1}b_n$) of a direct product. Hence the left-hand side of 4.05 can equal
the unit element only if both factors on the right-hand side do. Thus

$w(c_1, c_2, \cdots, c_{n-1}) = 1$

entails

$w(b_1, b_2, \cdots, b_{n-1}) = 1$,

and $H_{n-1}$ is a homomorphic image of $K_{n-1}$ under the mapping $c_\nu \to b_\nu$.

Conversely let $w(b_1, b_2, \cdots, b_{n-1}) = 1$. Then $w$ is a product of
conjugates (in $H_{n-1}$) of the left-hand sides of the defining relations for
$b_1, b_2, \cdots, b_{n-1}$. Now these defining relations express the permutability of
transforms by different powers of $b_q$, of any two elements expressible in terms
of $b_1, b_2, \cdots, b_{q-1}$. From these relations then follows also the permutability
of transforms by different powers of $b_q$, of any two elements expressible in
terms of $b_1^{-1}, b_2^{-1}, \cdots, b_{q-1}^{-1}$. Thus if

$w(b_1, b_2, \cdots, b_{n-1}) = 1$

is a relation connecting the generators, then

$w(b_1^{-1}, b_2^{-1}, \cdots, b_{n-1}^{-1}) = 1$

is a relation connecting their inverses. Hence in this case the whole right-hand
side of 4.05 equals the unit element, and

$w(c_1, c_2, \cdots, c_{n-1}) = 1$

follows from

$w(b_1, b_2, \cdots, b_{n-1}) = 1$.

The mapping $c_\nu \to b_\nu$ generates, therefore, an isomorphism between $K_{n-1}$ and
$H_{n-1}$.¹⁶


¹⁶ Note that $K_{n-1}$ does not, in general, contain all the elements $g^{-1}b_n^{-1}gb_n$, $g \in H_{n-1}$.
Thus it contains

$c_1c_2 = b_1^{-1}b_n^{-1}b_1b_n \cdot b_2^{-1}b_n^{-1}b_2b_n$,

but not

$(b_1b_2)^{-1}b_n^{-1}(b_1b_2)b_n$.

The intrinsic reason for the isomorphism of $K_{n-1}$ and $H_{n-1}$ is interesting, but beyond
the scope of this paper.




Now similarly $H_{n-1}$ contains in its commutator group a subgroup
isomorphic to $H_{n-2}$; hence $K_{n-1}$ contains in its commutator group a subgroup
$L_{n-2}$ isomorphic to $H_{n-2}$;¹⁷ and so it goes on. The idea of the construction is
now to consider a sequence of groups $\cdots, L, K$ rather than that of the
groups $H$; in this way we ensure that each term of the sequence lies in the
commutator group of its successor.¹⁸ To do this we define groups $G_n$ each
isomorphic to $H_n$; but such that an isomorphism from $G_n$ to $H_n$ maps $G_{n-1}$
on $K_{n-1}$, not on $H_{n-1}$. We define

4.06   $G_n = \{a_{11}, a_{21}, a_{22}, a_{31}, a_{32}, \cdots, a_{nn}$; 4.07, .08$\}$

with the relations

4.07   $[a_{pr}, a_{pq}^{-\lambda}a_{ps}a_{pq}^{\lambda}] = 1$ for $n \geq p \geq q > r, s$; $\lambda \neq 0$;

4.08   $[a_{pq}, a_{pp}] = a_{p-1,q}$ for $n \geq p > q$.

It is seen from 4.08 that $G_n$ can be generated by $a_{n1}, a_{n2}, \cdots, a_{nn}$;
and those relations 4.07 for which $p = n$ are the same as the defining
relations of $H_n$ in 4.02. Therefore the mapping

$b_\nu \to a_{n\nu}$,   $\nu = 1, 2, \cdots, n$,

generates a homomorphism of $H_n$ onto $G_n$.

To show that this homomorphism is an isomorphism we prove that all
the relations 4.07 follow already from those for which $p = n$, together with
4.08. The set of relations 4.07 for which $p$ has a certain fixed value, $p = m$,
say, will be denoted by 4.07$_m$ for short; similarly we denote by 4.08$_m$ those
relations 4.08 for which $p = m$. We show that 4.07$_{m-1}$ follow from 4.07$_m$
and 4.08$_m$; then 4.07$_p$ for $p = 1, 2, \cdots, n-1$ follow from 4.07$_n$ together
with 4.08.

Consider any word $w$ formed of $m - 1$ generators, $w(x_1, x_2, \cdots, x_{m-1})$.
Then, using 4.08,

4.09   $w(a_{m-1,1}, \cdots, a_{m-1,m-1}) = w([a_{m1}, a_{mm}], [a_{m2}, a_{mm}], \cdots, [a_{m,m-1}, a_{mm}])$.

By 4.07$_m$ each $a_{mm}^{-1}a_{ms}a_{mm}$ permutes with each $a_{mr}$, for $r, s = 1, 2, \cdots, m-1$. Hence

$w(a_{m-1,1}, \cdots, a_{m-1,m-1}) = w(a_{m1}^{-1}, \cdots, a_{m,m-1}^{-1}) \cdot w(a_{mm}^{-1}a_{m1}a_{mm}, \cdots, a_{mm}^{-1}a_{m,m-1}a_{mm})$.


¹⁷ $L_{n-2}$ is not a subgroup of $H_{n-1}$; hence its relation to $H_{n-1}$ is not the same as that
of $K_{n-1}$ to $H_n$.

¹⁸ If two sequences of groups are given
$A_1 \subset A_2 \subset \cdots$ and $B_1 \subset B_2 \subset \cdots$
such that $A_1 \cong B_1$, $A_2 \cong B_2$, $\cdots$, and if $A$ and $B$ are the groups generated by these
sequences by the Steinitz method (cf. 3.9), then $A$ and $B$ need not be isomorphic.




We apply this in particular to the left-hand side of 4.07$_{m-1}$, and obtain for
$m - 1 \geq q > r, s$; $\lambda \neq 0$,

4.10   $[a_{m-1,r},\ a_{m-1,q}^{-\lambda}a_{m-1,s}a_{m-1,q}^{\lambda}] = [a_{mr}^{-1},\ a_{mq}^{\lambda}a_{ms}^{-1}a_{mq}^{-\lambda}] \cdot a_{mm}^{-1}[a_{mr},\ a_{mq}^{-\lambda}a_{ms}a_{mq}^{\lambda}]a_{mm}$.

Here both commutators on the right-hand side equal the unit element by
4.07$_m$, and 4.07$_{m-1}$ follows.

Hence all the relations in $G_n$ follow from 4.07$_n$ and 4.08. The latter
are only explicit definitions of the generators $a_{pq}$, $p < n$, one for each, and
no relations between $a_{n1}, \cdots, a_{nn}$ can follow from them. If we generate
$G_n$ by means of $a_{n1}, a_{n2}, \cdots, a_{nn}$ only, then 4.07$_n$ form a complete system of
defining relations. Therefore the mapping generated by

4.11   $a_{n1} \to b_1$,   $a_{n2} \to b_2$,   $\cdots$,   $a_{nn} \to b_n$

is an isomorphism between $G_n$ and $H_n$. It is seen without difficulty that
this isomorphism maps $G_{n-1}$ on $K_{n-1}$; but the groups $H_n$ and $K_{n-1}$ are now
no longer needed.

To define order in $G_n$ we proceed as in 3.7; but in order to show that
this order continues the order correspondingly defined for a subgroup $G_m \subset G_n$
($m < n$), we define the order simultaneously in the subgroup. To order $G_m$
we form the chain of subgroups

4.12   $G_{m1} = \{a_{m1}\}$,   $G_{m2} = \{a_{m1}, a_{m2}\}$,   $\cdots$,   $G_{mm} = \{a_{m1}, \cdots, a_{mm}\}$.

Then

4.13   $G_{m1} \subset G_{m2} \subset \cdots \subset G_{mm} = G_m$.

We proceed by induction. $G_{m1}$ can be trivially ordered: $a_{m1}^{\lambda} \gtrless 1$ according
as $\lambda \gtrless 0$. We assume that $G_{m,q-1}$ has been ordered already. Now let $g$
be an element of $G_{mq}$. Then $g$ can be expressed in the form (cf. 3.7)

4.14   $g = a_{mq}^{\lambda}\prod_i a_{mq}^{-\mu_i}g_ia_{mq}^{\mu_i}$,¹⁹

where $\mu_1 > \mu_2 > \cdots$ and all $g_i \neq 1$ lie in $G_{m,q-1}$. Then we define

4.15   $g \gtrless 1$ if $\lambda \gtrless 0$ or $\lambda = 0$ and $g_1 \gtrless 1$ (in $G_{m,q-1}$).

It is easy to confirm the usual properties of this order relation, and we omit
the proof.


¹⁹ The product $\prod_i$ may consist of a single factor, or be absent.




To compare the order relations in $G_m$ and $G_{m-1}$ let $g \in G_{m-1}$ be expressed
as a word

4.16   $g = w(a_{m-1,1}, a_{m-1,2}, \cdots, a_{m-1,m-1})$.

Then in $G_m$ it can be expressed in the form 4.09, or preferably in the form

4.17   $g = a_{mm}^{-1}w(a_{m1}, \cdots, a_{m,m-1})a_{mm} \cdot w(a_{m1}^{-1}, \cdots, a_{m,m-1}^{-1})$.

This is of the form 4.14 with $q = m$, $\lambda = 0$, two factors in the product,
$\mu_1 = 1$, $\mu_2 = 0$. Hence $g \gtrless 1$ according as $w(a_{m1}, \cdots, a_{m,m-1}) \gtrless 1$. But
it is clear that

$w(a_{m1}, \cdots, a_{m,m-1}) \gtrless 1$

according as

$w(a_{m-1,1}, \cdots, a_{m-1,m-1}) \gtrless 1$;

for the first suffix $m$ of the generators does not enter the definition 4.15 at all.
Hence $g \gtrless 1$ qua element of $G_m$ according as $g \gtrless 1$ qua element of $G_{m-1}$.
By induction one then sees that the order of $G_n$ coincides in $G_m$ ($m < n$)
with the order of $G_m$.


We now have all the material together to construct the example, by
applying Steinitz' method 3.9 to the series

$G_1 \subset G_2 \subset G_3 \subset \cdots$.

4.2 Example. Let $G_\omega$ be the group generated by

$a_{11}, a_{21}, a_{22}, a_{31}, a_{32}, a_{33}, \cdots$

with the defining relations

4.21   $[a_{pr}, a_{pq}^{-\lambda}a_{ps}a_{pq}^{\lambda}] = 1$ for $p \geq q > r, s$; $\lambda \neq 0$;

4.22   $[a_{pq}, a_{pp}] = a_{p-1,q}$ for $p > q$.

Relations 4.22 ensure that $G'_\omega = G_\omega$. Let $G_\omega$ be ordered by the definition
4.15 when the element $g \in G_\omega$ is expressed in the form 4.14. Then $G_\omega$ is a
perfect ordered group.


As $G_\omega$ coincides with its commutator group, its lower central series is
stillborn. So is its upper central series, for the centre of $G_\omega$ is easily seen
to be $\{1\}$. One can show even more:


4.3 LEMMA. Every element $> 1$ in $G_\omega$ has arbitrarily large conjugates:
if $1 < g < h$ in $G_\omega$, then there is an element $t \in G_\omega$ such that
$t^{-1}gt > h$.




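Proof. Let $m$ be such that $g$ and $h$ both lie in $G_{m-1}$. We express them
in the next higher group $G_m$, using the representation 4.17, but abbreviating
it to

4.31   $g = a_{mm}^{-1}g_1a_{mm} \cdot g_2$,
4.32   $h = a_{mm}^{-1}h_1a_{mm} \cdot h_2$,

where $g_1$, $g_2$, $h_1$, $h_2$ are words in the generators $a_{m1}, a_{m2}, \cdots, a_{m,m-1}$, and we
also know that $g_1 > 1$. We put $t = a_{mm}$. Then

4.33   $t^{-1}gt \cdot h^{-1} = a_{mm}^{-2}g_1a_{mm}^{2} \cdot a_{mm}^{-1}g_2h_1^{-1}a_{mm} \cdot h_2^{-1} > 1$,

because $g_1 > 1$; and the result follows.
From this we see immediately: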
From this we see immediately: 


4.4 LEMMA. If the self-conjugate subgroup $H \subseteq G_\omega$ contains with any
element $h$ also all the elements between $h$ and its inverse,²⁰ then
$H = \{1\}$ or $H = G_\omega$.


This lemma allows us to show that $G_\omega$ (in the given order) is what one
would call “ordinally simple”:


4.5 THEOREM. If $G_\omega$ is mapped homomorphically on an ordered group $G^*$
such that $g < h$ in $G_\omega$ implies $g^* \leq h^*$ for the homomorphic images
of $g$ and $h$ in $G^*$, then either the homomorphism is trivial, i.e.,
$G^* = \{1\}$, or the homomorphism is an isomorphism.


Proof. Let the kernel of the homomorphism be the self-conjugate
subgroup $H$ of $G_\omega$. If $H \neq \{1\}$, $1 < h \in H$, and $1 < g < h$, $g \in G_\omega$ arbitrary,
then $1 \leq g^* \leq h^* = 1$; hence $g \in H$. Then $H = G_\omega$ by 4.4, and the
homomorphism is trivial. On the other hand, if $H = \{1\}$, then the homomorphism
is an isomorphism; which proves the theorem.


UNIVERSITY COLLEGE, 
HULL, ENGLAND. 


BIBLIOGRAPHY. 


[1] Baer, R., “The higher commutator subgroups of a group,” Bulletin of the 
American Mathematical Society, vol. 50 (1944), pp. 143-160. 

[2] Birkhoff, G., “Lattice-ordered groups,” Annals of Mathematics (2), vol. 43 (1942),
pp. 298-331.


²⁰ “Symmetric section” (Levi [9]) or “isolated subgroup.”




[3] Birkhoff, G., Review of Everett and Ulam, “On ordered groups,” Mathematical 
Reviews, vol. 7 (1946). 

[4] Everett, C. J., and S. Ulam, “On ordered groups,” Transactions of the American 
Mathematical Society, vol. 57 (1945), pp. 208-216. 

[5] Fraenkel, A., Einleitung in die Mengenlehre, 3rd ed., Springer, Berlin, 1928. 

[6] Hahn, H., “Über die nicht-archimedischen Grössensysteme,” Sitzungsberichte der
mathematisch-naturwissenschaftlichen Klasse der Akademie der Wissenschaften
zu Wien (IIa), vol. 116 (1907), pp. 601-653.

[7] Hall, P., “A contribution to the theory of groups of prime power order,”
Proceedings of the London Mathematical Society (2), vol. 36 (1933), pp. 29-95.

[8] Hilbert, D., Grundlagen der Geometrie, 7th ed., Teubner, Leipzig, 1930.

[9] Levi, F. W., “Ordered groups,” Proceedings of the Indian Academy of Sciences,
vol. 16 (1942), pp. 256-263.

[10] Levi, F. W., “Contributions to the theory of ordered groups,” Proceedings of the
Indian Academy of Sciences, vol. 17 (1943), pp. 199-201.

[11] Steinitz, E., Algebraische Theorie der Körper (ed. Baer and Hasse), de Gruyter,
Berlin, 1930.



ON UNIFORM SPACES WITH A UNIQUE STRUCTURE.* 


By RAOUF DOSS.


Let $E$ be a separated uniformisable space (see [1] and [2]). We first
prove, as a lemma, that if $\mathfrak{F}$ is a filter on $E$ without contiguous point (point
adhérent), then there exists a uniform structure, compatible with the topology
of $E$, for which $\mathfrak{F}$ is a Cauchy filter.
We propose next to characterize the uniformisable spaces for which there
is only one structure. In the space $E$ we shall say that two closed disjoint
sets $A$ and $B$ are normally separable if there exists a real function $f(x)$,
continuous on $E$, taking the value 0 on $A$ and the value 1 on $B$. With this
definition, in order that the separated uniformisable space $E$ have a unique
structure it is necessary and sufficient that of any two normally separable sets
one at least be compact.


Lemma. Let $E$ be a separated uniformisable space and $\mathfrak{F}$ a filter on $E$
without contiguous point. Then there exists a uniform structure $U$, compatible
with the topology of $E$, for which $\mathfrak{F}$ is a Cauchy filter.


Proof. Let $a$ be any element of $E$ and $V'_a$ an arbitrary neighborhood of $a$.
Since $a$ is not contiguous to $\mathfrak{F}$ there exists an $F \in \mathfrak{F}$ such that $a$ is exterior
to $F$. Let $W_a$ be a neighborhood of $a$ contained in $V'_a$ and having no point
in common with $F$, and let $f(x)$ be a continuous function taking the value 0
at $a$ and the value 1 outside $W_a$; in particular $f(x) = 1$ for $x \in F$.

Let $\epsilon > 0$ be arbitrary. Since $f(x)$ is continuous, to every $x$ corresponds
a neighborhood $V_x^\epsilon$ such that $y \in V_x^\epsilon$ implies $|f(x) - f(y)| < \epsilon$. We shall
take $V_x^\epsilon$ to be the set of points $y$ for which $|f(x) - f(y)| < \epsilon$. Put

$V_\epsilon = \bigcup_{x \in E} V_x^\epsilon \times V_x^\epsilon$.

We have $F \times F \subset V_\epsilon$. In fact, let $b \in F$. Then $f(b) = 1$ and $f(y) = 1$ for
$y \in F$. According to the definition of $V_b^\epsilon$ we have $F \subset V_b^\epsilon$, i.e., $F \times F \subset V_\epsilon$.
The family of the $V_\epsilon$ ($\epsilon > 0$) generates a family of entourages for the
set $E$. In fact, the $V_\epsilon$ are symmetric and the intersection of two $V_\epsilon$ is a $V_\epsilon$
and contains the diagonal $\Delta$. Moreover, $V_{\epsilon/4} \circ V_{\epsilon/4} \subset V_\epsilon$; for, the relation
$(x, y) \in V_{\epsilon/4}$ is equivalent to the existence of a $z$ such that $x \in V_z^{\epsilon/4}$ and
$y \in V_z^{\epsilon/4}$; in that case $|f(x) - f(y)| < \epsilon/2$; similarly $(y, z) \in V_{\epsilon/4}$ implies
$|f(y) - f(z)| < \epsilon/2$; whence $|f(x) - f(z)| < \epsilon$ and $(x, z) \in V_\epsilon$.
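The composition property just proved can be checked on a small finite model. The following is only a sketch under stated assumptions: the sample `points`, the function `f` and the value of `eps` are hypothetical choices made for illustration, and `entourage` and `compose` are names introduced here, not taken from the paper. Exact rational arithmetic is used so that the strict inequalities are not disturbed by rounding.

```python
from fractions import Fraction
from itertools import product


def entourage(points, f, eps):
    """V_eps: union over x of V_x^eps x V_x^eps, where V_x^eps = {y : |f(x) - f(y)| < eps}."""
    pairs = set()
    for x in points:
        ball = [y for y in points if abs(f(x) - f(y)) < eps]
        pairs.update(product(ball, ball))
    return pairs


def compose(v, w):
    """All (x, z) such that (x, y) is in v and (y, z) is in w for some y."""
    successors = {}
    for y, z in w:
        successors.setdefault(y, set()).add(z)
    return {(x, z) for x, y in v for z in successors.get(y, ())}


# Hypothetical finite sample of E = [0, 1]; f vanishes at a = 0 and equals 1 outside [0, 1/2).
points = [Fraction(i, 20) for i in range(21)]
f = lambda x: min(Fraction(1), 2 * x)
eps = Fraction(4, 5)

v_eps = entourage(points, f, eps)
v_quarter = entourage(points, f, eps / 4)
big_set = [x for x in points if f(x) == 1]   # plays the part of F, on which f = 1
```

On this sample one can then test `compose(v_quarter, v_quarter) <= v_eps` and `set(product(big_set, big_set)) <= v_quarter`, which mirror the inclusions $V_{\epsilon/4} \circ V_{\epsilon/4} \subset V_\epsilon$ and $F \times F \subset V_\epsilon$ of the proof.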


* Received January 15, 1948. 




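Finally, for $\epsilon < 1/2$, $V_\epsilon(a) \subset V'_a$ and $V_\epsilon(x)$ is a neighborhood of $x$.
In fact, $V_\epsilon(a)$ is the set of all $x$ for which $(a, x) \in V_\epsilon$. For such an $x$,
$|f(a) - f(x)| < 1$; whence $f(x) < 1$, $x \in W_a$ and $V_\epsilon(a) \subset W_a \subset V'_a$.
On the other hand, let $y \in V_x^\epsilon$; since $x \in V_x^\epsilon$ we have $(x, y) \in V_\epsilon$; whence
$y \in V_\epsilon(x)$ and $V_x^\epsilon \subset V_\epsilon(x)$, which shows that the $V_\epsilon(x)$ are neighborhoods of $x$.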

We have started from a fixed point $a$ and a fixed neighborhood $V'_a$ of $a$
(and a corresponding function $f(x)$) to obtain the filter generated by the
family of the

$V_\epsilon = V_\epsilon^{(a, V'_a)}$.

Consider the set $\Phi$ of these filters obtained from all the points $a$ of $E$ and
all the neighborhoods $V'_a$. Let $U$ be the filter generated by the union of the
filters of $\Phi$. $U$ is a filter of entourages for the set $E$. The sets

(1) $V = V_{\epsilon_1}^{(1)} \cap \cdots \cap V_{\epsilon_n}^{(n)}$,

where we write $V_{\epsilon_i}^{(i)}$ instead of $V_{\epsilon_i}^{(a_i, V'_{a_i})}$, constitute a base for the filter $U$.


We shall show that the topology induced by the uniform structure whose
filter of entourages is $U$ is exactly the topology of $E$. In fact, for every $x$,
$V(x)$ is a neighborhood of $x$; moreover, given the arbitrary neighborhood $V'_a$
of $a$, there exists a $V_\epsilon^{(a, V'_a)} \in U$ for which $V_\epsilon^{(a, V'_a)}(a) \subset V'_a$.

We shall show, finally, that $\mathfrak{F}$ is a Cauchy filter for the uniform structure
$U$, i.e. that for every $V \in U$ there exists an element of $\mathfrak{F}$ small of order $V$.
It is sufficient to consider a $V$ of the form (1). If, therefore, $F^{(1)}, \ldots, F^{(n)}$
are the elements of $\mathfrak{F}$ for which

$F^{(1)} \times F^{(1)} \subset V_{\epsilon_1}^{(1)}, \ \ldots, \ F^{(n)} \times F^{(n)} \subset V_{\epsilon_n}^{(n)}$,

then the set $F = F^{(1)} \cap \cdots \cap F^{(n)} \in \mathfrak{F}$ is small of order $V$, for it is small
of order $V_{\epsilon_1}^{(1)}, \ldots, V_{\epsilon_n}^{(n)}$. The lemma is now proved.

We shall now study the uniformisable spaces which have only one structure
compatible with their topology.


THEOREM. In order that the separated uniformisable space E have a 
unique structure, it is necessary and sufficient that of any two normally 
separable sets one at least be compact. 


Proof. Necessity. If $E$ has a unique structure $U$, this must coincide
with the uniform structure of Weil (see [1], p. 16) for which every continuous
function taking values in a uniform space is uniformly continuous. Suppose
that $A$ and $B$ are two closed, non-compact, normally separable sets and let
$f(x)$ be the function mentioned in the definition. Since $A$ is not compact
there exists on $A$ a filter $\mathfrak{F}$ without contiguous point in $A$. $\mathfrak{F}$ is also without
contiguous point in $E$, for, the closures of the sets of $\mathfrak{F}$ in $A$ and in $E$ are
identical, since $A$ is closed in $E$. Similarly, there exists on $B$ a filter $\mathfrak{G}$
without contiguous point in $E$. The filter $\mathfrak{H}$ generated by the sets $H$ of
the form $H = F \cup G$, with $F \in \mathfrak{F}$ and $G \in \mathfrak{G}$, is also without contiguous point.
By the lemma, $\mathfrak{H}$ is a Cauchy filter for the unique structure $U$. $f(x)$ being
uniformly continuous, the image of $\mathfrak{H}$ by $f$ must be the base of a Cauchy
filter in the space of the real numbers. But this is not so, for, the image of
$\mathfrak{H}$ by $f$ has at least two contiguous points 0 and 1. The existence of the two
non-compact sets $A$ and $B$ is thus impossible.


Sufficiency. We shall show first that if the condition of the theorem
is satisfied then $E$ is precompact for each of its structures. In fact, if $E$ is
not precompact for a structure $U$ there exists an entourage $U_0 \in U$ such that
there is no finite covering of $E$ by sets small of order $U_0$. (We shall consider
only symmetric entourages.) This means that we can find an enumerable
sequence $x_1, x_2, \ldots, x_n, \ldots$ of points of $E$ such that we never have $(x_n, x_m) \in U_0$
for $m \neq n$. Let $U_1$ be such that $U_1 \circ U_1 \subset U_0$; we conclude that the sets
$U_1(x_m)$, $U_1(x_n)$ are disjoint for $m \neq n$. Take $U_2$ such that $U_2 \circ U_2 \subset U_1$.
The neighborhood $U_2(x_n)$ of $x_n$ contains an open neighborhood $G_n$ of $x_n$ and
$G_n$ contains a closed neighborhood $F_n$ of $x_n$. Let $f_n(x)$ be the real continuous
function such that $f_n(x_n) = 1$ and $f_n(x) = 0$ for $x \in \complement F_n$ and let

$f(x) = \sum_{n=1}^{\infty} f_{2n}(x)$.

Then $f(x)$ is continuous. In fact, at a point $x \in G_{2n}$, $f(x)$ is continuous since
it coincides with $f_{2n}(x)$ in some neighborhood of $x$ ($G_{2n}$ is open and the $G_{2n}$
are disjoint). For any $x$, in particular for $x \in \complement \bigcup G_{2n}$, $U_2(x)$ has points
in common with one $U_2(x_{2n})$ at most. In fact, if, for a certain $x_{2n}$ and a
certain $y$ we have $y \in U_2(x_{2n})$ and $y \in U_2(x)$, then $(x_{2n}, x) \in U_1$. If the same
were true for another $x_{2m}$ it would give $(x_{2m}, x) \in U_1$, whence $(x_{2n}, x_{2m}) \in U_0$,
which is excluded. Thus $U_2(x)$ has points in common with one $U_2(x_{2n})$
at most, say with $U_2(x_{2n_0})$. If $x \in \complement \bigcup G_{2n}$, $x$ is exterior to $F_{2n_0}$ and there
exists a neighborhood $H$ of $x$ such that $H \subset U_2(x)$ and such that $H$ and $F_{2n_0}$
are disjoint. This neighborhood $H$ meets no $F_{2n}$. We conclude that in $H$
the function $f$ is always zero. Since $H$ is a neighborhood of $x$ the function $f$
is continuous at $x$.


If we put

$A = \{x_1, x_3, \ldots, x_{2n+1}, \ldots\}$,
$B = \{x_2, x_4, \ldots, x_{2n}, \ldots\}$,

then $f(x) = 0$ for $x \in A$ and $f(x) = 1$ for $x \in B$. If we show that $A$ and $B$
are closed and non-compact this would contradict the assumption of the
theorem and we should have proved that $E$ is precompact for each of its
structures. It is almost evident that $A$, for example, is closed, for if $x \in \bar{A}$,
then, as before, $U_2(x)$ would have points in common with one $U_2(x_{2n-1})$
at most and would contain one element of $A$ at most; whence $x \in A$.
Similarly, $A$ is not compact, for, the sequence $x_1, x_3, \ldots, x_{2n+1}, \ldots$ has no
accumulation point in $A$.

Suppose now that $E$ had two different structures $U$ and $U'$. It is
impossible that $E = \hat{E}(U)$ or $E = \hat{E}(U')$ ($\hat{E}(U)$ denotes the separated
completed space of $E$ for the structure $U$ and we identify $E$ with the everywhere
dense subspace of $\hat{E}(U)$ with which it is isomorphic). In fact, in such
a case, $E$ would be compact and would have one structure only. Hence $\hat{E}(U)$
and $\hat{E}(U')$ have each at least one point more than $E$. It is impossible that
$\hat{E}(U)$ and $\hat{E}(U')$ have each one point only ($\omega$ and $\omega'$ respectively) more
than $E$. In fact, in such a case, $U$ and $U'$ would be identical. To see this,
let $(V_\omega)$ be the trace on $E$ of the filter of neighborhoods of $\omega$ in $\hat{E}(U)$;
this filter has no contiguous point in $E$ but it must have at least one contiguous
point in $\hat{E}(U')$; the only possible point is $\omega'$; since $\hat{E}(U')$ is compact,
we conclude that $(V_\omega)$ converges to $\omega'$, i.e. is finer than the trace $(V_{\omega'})$ on $E$
of the filter of neighborhoods of $\omega'$ in $\hat{E}(U')$; in the same manner $(V_{\omega'})$
is finer than $(V_\omega)$. The two compact spaces $\hat{E}(U)$ and $\hat{E}(U')$ are thus
homeomorphic, which implies that they are isomorphic; the equality of the
two structures $U$ and $U'$ follows.

Hence we must suppose that $\hat{E}(U)$, for example, has at least two points
$a$ and $b$ more than $E$. Let $V'_a$ and $V'_b$ be two closed disjoint neighborhoods
of $a$ and $b$ and let $V_a$ and $V_b$, respectively, be their traces on $E$. $\hat{E}(U)$ being
compact, hence normal, there exists a real continuous function taking the
value 0 on $V'_a$ and the value 1 on $V'_b$. Its restriction to $E$ takes the value
0 on $V_a$ and the value 1 on $V_b$. $V_a$ and $V_b$, which are closed in $E$, are thus
normally separable. If we show that neither $V_a$ nor $V_b$ is compact, this would
contradict the assumption of the theorem and the existence of the two
different structures $U$ and $U'$ will be impossible.

If $V_a$ were compact it would be closed in $\hat{E}(U)$. But we have $a \notin V_a$
and $a \in \bar{V}_a$. To see that $a$ is contiguous to $V_a$ in $\hat{E}(U)$ it will be sufficient
to consider the neighborhoods $V''_a$ of $a$ contained in $V'_a$. All these neighborhoods
meet $E$ since $E$ is everywhere dense in $\hat{E}(U)$, so that all of them
meet $V_a = V'_a \cap E$. Hence $V_a$ cannot be compact and the theorem is proved.


Remarks. 1. We deduce easily from what precedes that if $E$ has a
unique structure then $E$ is locally compact. In fact we have seen that when
$E$ is not compact, its unique structure $U$ must be such that $\hat{E}(U)$ has one
point only, $\omega$, more than $E$. Let $x \in E$ and let $V_x$ be a closed neighborhood
of $x$ in $\hat{E}(U)$, not containing $\omega$; $V_x$ is compact since $\hat{E}(U)$ is compact;
its restriction to $E$ is still $V_x$, which proves that $x$ has a compact neighborhood.


2. In the case of a metrizable space $E$ the existence of a unique structure
implies that $E$ is compact. In fact, by a theorem of J. Dieudonné [3],
$E$ has a uniform structure $U$ for which it is complete: $E = \hat{E}(U)$. Since
$\hat{E}(U)$ must be compact, then $E$ is compact.


Farouk I UNIVERSITY, 
ALEXANDRIA, EGYPT. 


BIBLIOGRAPHY. 


[1] A. Weil, “Sur les espaces à structure uniforme et sur la topologie générale,”
Actualités Scientifiques, No. 551, Hermann, Paris (1937).

[2] N. Bourbaki, “Topologie générale,” Chapters I and II, Actualités Scientifiques,
No. 858, Hermann, Paris (1940).

[3] J. Dieudonné, Comptes Rendus de l'Académie des Sciences, Paris, vol. 209 (1939),
p. 108.




ON THE ACCESSORY PARAMETERS OF A FUCHSIAN 
DIFFERENTIAL EQUATION.* 


By ZEEV NEHARI.


It was shown by Schwarz in his classical memoir on the hypergeometric
series [1]¹ that the ratio $w = y_1/y_2$ of two linearly independent solutions
$y_1$ and $y_2$ of the Fuchsian equation


n-3 


| — , “a1 
(1) y=0 
aa II (z — ay) 
n-1 


(all constants real) effects the conformal mapping of the half-plane $\Im\{z\} > 0$
onto a curvilinear polygon, composed of $n$ circular arcs, in the $w$-plane.
$\pi\alpha_\nu$ are the angles of the polygon, where $\alpha_n$ is defined by


, , , 
On = ny A = ny wn tay + Sa = n—2, 
p=1 


and the points $z = a_\nu$ are mapped onto the vertices with angles $\pi\alpha_\nu$. The
$n - 3$ real parameters $A_1, \ldots, A_{n-3}$, called by Klein the “accessory
parameters” of equation (1), correspond to $n - 3$ geometrical variables
determining a polygon with given angles in a unique manner (apart from a linear
transformation of the $w$-plane).

The object of this paper is to investigate the functional relationship
between the geometrical variables defining the curvilinear polygon and the
accessory parameters $A_1, \ldots, A_{n-3}$ of equation (1). A number of particular
cases of this problem were discussed, either with the help of Hilbert's theory
of integral equations or using Klein's method of continuity, by Hilbert [2],
Koenig [3], Klein [4], Hilb [5] and others. In all these cases the main
geometrical condition imposed on the curvilinear polygon is the existence of
an orthogonal circle, a condition fundamental in the theory of automorphic
functions. Another case ($n = 4$, special values of the angles and an additional
geometrical condition) was recently treated by v. Koppenfels [6].

We shall begin with the discussion of the case $n = 4$, in which the
required analysis is particularly simple but which nevertheless shows all the
characteristic features of the general case.


* Received July 1, 1947; revised January 21, 1948.
¹ The numbers in brackets refer to the bibliography at the end of the paper.


1. The case $n = 4$. Equation (1) reduces in this case to

(2) $y'' + \left( \dfrac{1-\alpha}{z-a} + \dfrac{1-\beta}{z-b} + \dfrac{1-\gamma}{z-c} \right) y' + \dfrac{Az + B}{(z-a)(z-b)(z-c)}\, y = 0$

with

$A = \delta'\delta''$,

and obvious changes of notation; $B$ is the accessory parameter. (2) is sometimes
called “Heun's equation”; it has recently been studied, from a different
point of view, by Svartholm [7] and Erdélyi [8].

The angles of the quadrilateral upon which $\Im\{z\} > 0$ is mapped by the
ratio $w(z) = y_1/y_2$ of two linearly independent solutions $y_1$ and $y_2$ of (2)
are $\pi\alpha$, $\pi\beta$, $\pi\gamma$, $\pi\delta$ respectively, where $\delta = \delta' - \delta''$; the vertices are situated
at the points $w(a)$, $w(b)$, $w(c)$, $w(\infty)$. In order to keep the discussion as
simple as possible, we shall assume that $0 < \alpha, \beta, \gamma, \delta < 1$, thus confining
ourselves to polygons with corners pointing outwards. Unless a restriction
of this kind is introduced, there exists quite a surprising variety of different
types of curvilinear quadrilaterals, as was shown in detail by Schoenflies [9]
and Ihlenburg [10].

Before we proceed to formulate a statement describing the general functional
dependence of the accessory parameter $B$ on the geometry of the
quadrilateral, we shall introduce a few notations needed in the sequel. The
full circle, part of which coincides with the conformal image of the stretch
$a < z < b$, will be called $K_{ab}$, and similarly for the other circles. We shall
characterize the relative position of $K_{ab}$ and $K_{c\infty}$ by the cross-ratio of the
four points in which the two circles are cut by the straight line connecting
their centers. As there are six possible values of this cross-ratio, we shall
have to choose one particular value in order to make things definite. This
will be done in the following way:

Among these six values there are two, say $\theta$ and $\theta^*$, which become infinite
if one of the two circles $K_{ab}$, $K_{c\infty}$ reduces to a point. These values obviously
satisfy $\theta + \theta^* = 1$. It is further easy to see that in case $K_{ab}$ and $K_{c\infty}$ do not
intersect, one of the cross-ratios is negative and the other $> 1$; if $K_{ab}$ and $K_{c\infty}$
intersect, both cross-ratios are positive, one of them, say $\theta$, satisfying
$0 < \theta \leq \tfrac12$ and the other $\tfrac12 \leq \theta^* < 1$. If $K_{ab}$ and $K_{c\infty}$ touch, the cross-ratios
reduce to 0 and 1 respectively and if $K_{ab}$ and $K_{c\infty}$ are orthogonal, both $\theta$ and
$\theta^*$ reduce to $\tfrac12$. In order to define the cross-ratio in question in a unique
manner, we shall first assume that $K_{ab}$ and $K_{c\infty}$ do not intersect. Then there
are two possibilities: Either we can, by moving continuously the four vertices
on the circles $K_{ab}$ and $K_{c\infty}$ respectively and varying continuously the four
angles, convert the given quadrilateral into another quadrilateral (likewise
composed of four circular arcs) with four right angles, or we cannot. If this
continuous reduction to a “rectangle” is possible, we shall take the negative
cross-ratio; in the other case we shall take the positive. If $K_{ab}$ and $K_{c\infty}$
intersect, there are two ways of detaching them from each other in a continuous
manner: Either the two circles will pass through a position in which
they are orthogonal to each other, or they will not. If we take the second
alternative and, while detaching the circles from each other, allow any continuous
change of the quadrilateral which leaves its vertices on the respective
circles, there are again two possibilities: Either the new quadrilateral thus
obtained is continuously reducible to a “rectangle” in the sense indicated
above, or else it is not. In the first case we choose that value of the cross-ratio
which is between 0 and $\tfrac12$; in the latter case the value of the cross-ratio has
to be taken between $\tfrac12$ and 1.
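The six values of the cross-ratio can be listed explicitly in the simplest configuration. The sketch below assumes two concentric circles of radii $p < r$ centered at the origin, so that a line through the common center cuts them at $\pm p$ and $\pm r$; the radii chosen, and the names `cross_ratio` and `six_values`, are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction
from itertools import permutations


def cross_ratio(z1, z2, z3, z4):
    """One common convention for the cross-ratio of four distinct points."""
    return (z1 - z3) * (z2 - z4) / ((z1 - z4) * (z2 - z3))


def six_values(points):
    """Distinct values taken by the cross-ratio over all orderings of the four points."""
    return {cross_ratio(*ordering) for ordering in permutations(points)}


# Hypothetical radii of two concentric circles; the line through the center meets them at +-p, +-r.
p, r = Fraction(1), Fraction(3)
values = six_values([-r, -p, p, r])

# The two values with 4pr in the denominator are the ones that blow up when p or r tends to 0.
theta = -(r - p) ** 2 / (4 * p * r)
theta_star = (r + p) ** 2 / (4 * p * r)
```

For non-intersecting circles this reproduces the pattern described above: `theta` is negative, `theta_star` exceeds 1, both occur among the six `values`, and their sum is 1.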

We have thus defined the value of the cross-ratio in a unique manner
in all possible cases (provided the angles are all positive and smaller than $\pi$).
This particular value will be denoted by $\theta_1$. The cross-ratio defined in the
same manner with regard to $K_{bc}$ and $K_{\infty a}$ will be denoted by $\theta_2$.

With the help of these notations, the general result we shall establish
may be stated as follows:


The ratio $w(z) = y_1/y_2$ of two linearly independent solutions of (2) maps
the half-plane $\Im\{z\} > 0$ conformally on a curvilinear quadrilateral with
angles $\pi\alpha$, $\pi\beta$, $\pi\gamma$, $\pi\delta$. If the cross-ratio $\theta_1$ is defined as above and the
parameter $\lambda$ is introduced by
A= 4B+ 

(3) +2(1—a—f—y){(aa + + cy) 
+ aa? + + + 

then $\lambda$ is a root of the equation


(4) $D(\lambda) = \theta_1$,


where $D(\lambda)$ is the absolutely converging determinant $|A_{n\nu}|$ defined by




n?(1 pace q?n-?v) 


NEV 


n= 0 
(5) Ann = 1— + (1 — —8*)} 


Aoo = + (1 — a? — — y? — 8). 


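Here, $2\omega_1$ and $2\omega_2$ ($\omega_1$ real, $\omega_2$ purely imaginary, $\Im\{\omega_2/\omega_1\} > 0$) are the
periods of Weierstrass' $\wp$-function defined by


$\wp'^2(u) = 4\{\wp(u) - e_1\}\{\wp(u) - e_2\}\{\wp(u) - e_3\}$


with
$e_1 = a - (a+b+c)/3$; $e_2 = b - (a+b+c)/3$;
$e_3 = c - (a+b+c)/3$;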

$q$ denotes, as usual, $\exp\{\pi i \omega_2/\omega_1\}$ and $\eta_1$ is the increment of Weierstrass'
$\zeta$-function if $2\omega_1$ is added to the argument.
If $\theta_2$ is given, $\lambda$ is a root of


(4a) $D'(\lambda) = \theta_2$,
where $D'(\lambda)$ is obtained from $D(\lambda)$ by interchanging $\omega_1$ and $\omega_2$.

If neither $\theta_1$ nor $\theta_2$, but another invariant property of the quadrilateral
is given (e.g., the existence of an orthogonal circle), this property can always
be expressed by an identical relation between $\theta_1$ and $\theta_2$. If this relation is of
the form

$F(\theta_1, \theta_2) = 0$,

then $\lambda$ is a root of the equation

(6) $F\{D(\lambda), D'(\lambda)\} = 0$.


In all these cases, the equations for $\lambda$ have an infinity of roots, although
not all of them necessarily real. To all real roots correspond quadrilaterals
with the required properties. There is always one particular real root which
gives rise to a quadrilateral not overlapping itself provided, of course, such a
quadrilateral is geometrically possible.


Before we prove this statement, we shall transform equation (2) by
introducing elliptic functions, as in the usual treatment of Lamé's equation.
Writing

$z_1 = z - (a+b+c)/3$,
$a - (a+b+c)/3 = e_1$, $b - (a+b+c)/3 = e_2$,
$c - (a+b+c)/3 = e_3$; $(e_1 + e_2 + e_3 = 0)$,

we set

$z_1 = \wp(u)$, $(dz_1/du)^2 = 4(z_1 - e_1)(z_1 - e_2)(z_1 - e_3)$,

the elliptic function $\wp(u)$ having one real and one purely imaginary period.


We then write

$y = s(u)v(u)$

with a suitably chosen $s(u)$ so as to make $v(u)$ satisfy an equation of the
form

(7) $v'' + r(u)v = 0$,

the primes now denoting differentiations with respect to $u$. Since $z_1 = \wp(u)$
maps a rectangle on the half-plane $\Im\{z_1\} > 0$ and

$y_1/y_2 = v_1/v_2$,

where $v_1$ and $v_2$ are two linearly independent solutions of (7), the function

$w = f(u) = v_1(u)/v_2(u)$

will map a rectangle on a curvilinear quadrangle, the corners of the rectangle
corresponding to the vertices of the quadrangle.
By carrying out the required calculations it is found that the exact form 


of (7) is 
(8) + (u) + (U—o1) + (4—7") 02) 
+ (u—es) +A}v=0, 


where $\lambda$, the only constant remaining undetermined, is the accessory
parameter.

This parameter is, of course, not identical with the accessory parameter
$B$ of equation (2). By equating the first two coefficients in the expansions
of the solutions of (2) and (8) in the vicinity of one of the singular points
and remembering the transformation connecting $z$ and $u$, $\lambda$ and $B$ are found
to be connected by (3).
In order to remove the singularities from the real and imaginary axes,
for reasons which will become clear presently, we make one last transformation,
writing $u = t + (\omega_1 + \omega_2)/2$ and $v(u) = \eta(t)$. (8) thus becomes


(9) + ((4— 2°) @ (t + 02/2) + (t — 02/2) 
(4— (¢ + 1/2 — w2/2) + + + w2/2) 
+ = 0, 

which, for short, we shall write

(9a) $\eta'' + \{g(t) + \lambda\}\eta = 0$.


We shall now assume that the circles $K_{ab}$ and $K_{c\infty}$ do not intersect (the
case in which they do will be dealt with later). In this case, we can find a
linear substitution transforming $K_{ab}$ and $K_{c\infty}$ into two concentric circles with
their common center at the origin. The radii of these two circles will be
denoted by $p$ and $r$ ($p < r$).

We then choose the two independent solutions $\eta_1$ and $\eta_2$ of (9) in such a
way as to make the ratio


(10) $w(t) = \eta_1/\eta_2$


map the fundamental rectangle on the particular quadrilateral thus obtained.


Two successive inversions of the rectangle along the two sides parallel to
the imaginary axis will result in a point $t$ being shifted by $2\omega_1$; two corresponding
inversions with respect to the two concentric circles of radii $p$ and
$r$ respectively will transform $w$ into $(p/r)^2 w$ (or $(r/p)^2 w$, if the order of
inversions is interchanged). By Schwarz's symmetry principle, we have
therefore


(11) $w(t + 2\omega_1) = (p/r)^2 w(t)$.


On the other hand, $g(t)$ having the period $2\omega_1$, equation (9a) is unaffected
by replacing $t$ by $t + 2\omega_1$. Both $\eta_1(t + 2\omega_1)$ and $\eta_2(t + 2\omega_1)$ will therefore
also be solutions of (9a) and consequently be of the form

$\eta_1(t + 2\omega_1) = A\eta_1(t) + B\eta_2(t)$,
$\eta_2(t + 2\omega_1) = C\eta_1(t) + D\eta_2(t)$,

with constant $A$, $B$, $C$, $D$. In view of (10) and (11) this yields

$\dfrac{A\eta_1(t) + B\eta_2(t)}{C\eta_1(t) + D\eta_2(t)} = (p/r)^2\, \dfrac{\eta_1(t)}{\eta_2(t)}$.

This is only possible if we have

$B = C = 0$,
$A/D = (p/r)^2$.

Now any two solutions of (9a) are connected by a relation

$\eta_1'(t)\eta_2(t) - \eta_1(t)\eta_2'(t) = \text{const}$.

Substituting $t + 2\omega_1$ for $t$, this leads to

$AD - BC = 1$,

whence, in view of $B = C = 0$, $A/D = (p/r)^2$,

$A = \pm p/r$, $D = \pm r/p$.

It is clear that although both the positive and negative values of A and 
D so far fulfill all the conditions, one set must be discarded in each particular 
case, as otherwise we would be led to four linearly independent solutions of a 
linear differential equation of the second order, which is absurd. In order 
not to interrupt the argument, we shall postpone the decision as to the correct 
sign to take, and shall, for the time being, carry both signs in the formulas. 

The two solutions singled out by our particular location of the quadrilateral
have thus the following transformation properties:


$\eta_1(t + 2\omega_1) = \pm (p/r)\,\eta_1(t)$,
$\eta_2(t + 2\omega_1) = \pm (r/p)\,\eta_2(t)$,


i.e. they are the well-known multiplicative solutions forming the subject of 
Floquet’s theory of differential equations with periodic coefficients. Their 
further discussion will accordingly be modelled on Hill’s treatment of his 
equation with periodic coefficients occurring in the lunar theory. 
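The multiplicative solutions can be exhibited numerically. The following is a rough sketch, not the paper's method: it integrates an equation of the form $y'' + q(t)y = 0$ with periodic $q$ over one period by a classical Runge-Kutta scheme and reads off the matrix playing the part of $A$, $B$, $C$, $D$ (in a different basis of solutions). The function names, the step count and the test coefficient are assumptions made for illustration.

```python
import cmath
import math


def monodromy(q, period, steps=2000):
    """Integrate y'' + q(t) y = 0 over one period with classical RK4.

    Returns (a, b, c, d): the end values y(period), y'(period) of the two
    solutions with initial data (1, 0) and (0, 1), arranged as [[a, b], [c, d]].
    """
    h = period / steps

    def rhs(t, y, dy):
        return dy, -q(t) * y

    ends = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):
        t = 0.0
        for _ in range(steps):
            k1 = rhs(t, y, dy)
            k2 = rhs(t + h / 2, y + h / 2 * k1[0], dy + h / 2 * k1[1])
            k3 = rhs(t + h / 2, y + h / 2 * k2[0], dy + h / 2 * k2[1])
            k4 = rhs(t + h, y + h * k3[0], dy + h * k3[1])
            y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            dy += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
            t += h
        ends.append((y, dy))
    (a, c), (b, d) = ends
    return a, b, c, d


def floquet_multipliers(q, period):
    """Determinant of the monodromy matrix and its two eigenvalues."""
    a, b, c, d = monodromy(q, period)
    det = a * d - b * c
    root = cmath.sqrt((a + d) ** 2 - 4 * det)
    return det, (a + d + root) / 2, (a + d - root) / 2


# Hypothetical test case: g = 0 and lambda_0 = -1, i.e. y'' - y = 0, with period 2.
# Analytically the solutions are e^t and e^-t, so the multipliers are e^2 and e^-2.
det, m1, m2 = floquet_multipliers(lambda t: -1.0, 2.0)
```

The determinant stays close to 1 because the Wronskian of two solutions is constant, which is the relation $AD - BC = 1$ used above; the two multipliers are therefore reciprocal, like $\pm p/r$ and $\pm r/p$.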

If $s$ is one of the solutions of


(11a) $e^s = \pm p/r$,


the function

$e^{(s/2\omega_1)t}$

is also multiplied by $\pm p/r$ if $t$ is replaced by $t + 2\omega_1$; $\eta_1(t)$ is therefore of
the form

(12) $\eta_1(t) = e^{(s/2\omega_1)t}\phi(t)$,

where $\phi(t)$ is a periodic function with the period $2\omega_1$, i.e.

$\phi(t + 2\omega_1) = \phi(t)$.




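The function $g(t)$ in (9a) is regular in the strip $-\Im\{\omega_2/2\} < \Im\{t\}
< \Im\{\omega_2/2\}$ and the same is therefore true of any solution of (9a). Hence,
both $g(t)$ and $\phi(t)$, having the period $2\omega_1$, may be expanded into Fourier
series converging absolutely in the interior of this strip:

$g(t) = \sum_{n=-\infty}^{\infty} c_n e^{(\pi i n/\omega_1)t}$,

$\phi(t) = \sum_{n=-\infty}^{\infty} a_n e^{(\pi i n/\omega_1)t}$.

By (12), we have further

$\eta_1(t) = \sum_{n=-\infty}^{\infty} a_n e^{(\pi i n/\omega_1 + s/2\omega_1)t}$.

Inserting these expressions in (9a) and equating the coefficients of the powers
of $\exp\{(\pi i/\omega_1)t\}$ to zero, we obtain the system of equations

$a_n\{(\pi i n/\omega_1 + s/2\omega_1)^2 + \lambda + c_0\} + \sum_{\nu=-\infty}^{\infty}{}' \, a_\nu c_{n-\nu} = 0$,

where the prime indicates that the term involving $a_n$ has to be omitted from
the summation. Writing for short

$is/2\pi = \sigma$, $(\omega_1/\pi)^2(\lambda + c_0) = \mu$, $(\omega_1/\pi)^2 c_n = d_n$,

this becomes

$a_n\{(n - \sigma)^2 - \mu\} - \sum_{\nu=-\infty}^{\infty}{}' \, a_\nu d_{n-\nu} = 0$.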
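The passage from the Fourier series to the recurrence for the $a_n$ can be written out step by step. The block below is a sketch of that computation, taking (9a) in the form $\eta'' + \{g(t) + \lambda\}\eta = 0$, which is the form the coefficient equations indicate; the shorthand $k_n$ is introduced here only for convenience and does not appear in the paper.

```latex
% Sketch of the coefficient comparison; k_n is a shorthand introduced for this note.
\begin{align*}
\eta_1(t) &= \sum_{n} a_n e^{k_n t},
  \qquad k_n = \frac{\pi i n}{\omega_1} + \frac{s}{2\omega_1}
             = \frac{\pi i}{\omega_1}\,(n - \sigma),
  \qquad \sigma = \frac{is}{2\pi}, \\
\eta_1''(t) &= \sum_{n} a_n k_n^{2}\, e^{k_n t}
             = -\Bigl(\frac{\pi}{\omega_1}\Bigr)^{2} \sum_{n} a_n (n-\sigma)^{2} e^{k_n t}, \\
g(t)\,\eta_1(t) &= \sum_{n} \Bigl(\sum_{\nu} a_\nu c_{n-\nu}\Bigr) e^{k_n t}.
\end{align*}
% Coefficient of e^{k_n t} in \eta_1'' + (g + \lambda)\eta_1 = 0:
\[
-\Bigl(\frac{\pi}{\omega_1}\Bigr)^{2}(n-\sigma)^{2} a_n + (\lambda + c_0)\,a_n
  + \sum_{\nu \neq n} a_\nu c_{n-\nu} = 0 .
\]
% Multiplying by -(\omega_1/\pi)^2 and using the abbreviations \mu and d_n:
\[
a_n\bigl\{(n-\sigma)^{2} - \mu\bigr\} - \sum_{\nu \neq n} a_\nu d_{n-\nu} = 0,
\qquad \mu = \Bigl(\frac{\omega_1}{\pi}\Bigr)^{2}(\lambda + c_0),
\quad d_n = \Bigl(\frac{\omega_1}{\pi}\Bigr)^{2} c_n .
\]
```

The term with $\nu = n$ in the convolution is the one carrying $c_0$, which is why it is absorbed into $\mu$ and omitted from the primed sum.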


In order that this system of homogeneous linear equations be consistent, its
determinant must vanish provided, of course, it converges. Convergence
being easily secured by dividing the $n$-th equation by $(n - \sigma)^2 - \mu$, $\sigma$ and $\mu$
are thus connected by the equation

$D(\sigma, \mu) = |A_{n\nu}| = 0$,

where the elements of the determinant $|A_{n\nu}|$ are defined by

$A_{n\nu} = -d_{n-\nu}/\{(n - \sigma)^2 - \mu\}$ $(n \neq \nu)$,
$A_{nn} = 1$.


The functional dependence of this determinant on the parameter $\sigma$ can
easily be found by an artifice going back to Hill [11]. Using his procedure,
it is found that the equation $D(\sigma, \mu) = 0$ is equivalent to the equation


$D_1(\mu) = \sin^2 \pi\sigma$,




where $D_1(\mu)$ denotes the determinant $|B_{n\nu}|$ defined by
Bry = — (n= v,n 0) 
Bun = 1— (n/n?) (n 0) 


Bov = 1° d_y, Boo = 
$\sigma$ having been defined by


$\sigma = is/2\pi$, $e^s = \pm p/r$,

we have

$D_1(\mu) = -(r - p)^2/4pr$,

if $e^s = p/r$, or

$D_1(\mu) = (r + p)^2/4pr = 1 - [-(r - p)^2/4pr]$,

if $e^s = -p/r$.


The expressions $-(r - p)^2/4pr$ and $(r + p)^2/4pr$ have a simple geometrical
interpretation. They are two values of the cross-ratio of the four
points at which a straight line passing through the common center of the
two circles, of radii $p$ and $r$ respectively, intersects these circles. It is
further obvious that both these cross-ratios become infinite if either $p$ or $r$
reduces to zero. As these cross-ratios are invariant with respect to an arbitrary
linear transformation, we can make our result independent of the particular
location of the quadrilateral assumed in the course of the proof. Since, in
this particular case, $-(r - p)^2/4pr$ and $(r + p)^2/4pr$ each coincide with
one of the cross-ratios $\theta$ and $\theta^*$ defined above, $\mu$ will therefore always be a
solution of

$D_1(\mu) = \theta$
or
$D_1(\mu) = \theta^* = 1 - \theta$.
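The two values of $\sin^2 \pi\sigma$ can be confirmed numerically from the definitions $\sigma = is/2\pi$ and $e^s = \pm p/r$. The snippet is a small sketch under assumptions: the radii are hypothetical, and `sin2_pi_sigma` is a helper name introduced here. It uses the principal logarithm for $s$; any other solution $s + 2\pi i k$ shifts $\sigma$ by an integer and leaves $\sin^2 \pi\sigma$ unchanged.

```python
import cmath
import math


def sin2_pi_sigma(multiplier):
    """sin^2(pi * sigma) for sigma = i s / (2 pi), where e^s = multiplier."""
    s = cmath.log(multiplier)          # principal solution of e^s = multiplier
    sigma = 1j * s / (2 * math.pi)
    return cmath.sin(math.pi * sigma) ** 2


# Hypothetical radii of the two concentric circles, p < r.
p, r = 1.0, 3.0
theta = -(r - p) ** 2 / (4 * p * r)        # expected for the factor +p/r
theta_star = (r + p) ** 2 / (4 * p * r)    # expected for the factor -p/r

value_plus = sin2_pi_sigma(p / r)
value_minus = sin2_pi_sigma(-p / r)
```

Here `value_plus` agrees with `theta` and `value_minus` with `theta_star` up to rounding, since $\sin^2(is/2) = -\sinh^2(s/2)$ and $\sin^2(is/2 - \pi/2) = \cosh^2(s/2)$.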


In order to decide which of these two equations is to be taken in a given
case, we recall that the negative cross-ratio corresponds to the periodicity
factor $p/r$ while the positive one belongs to the factor $-p/r$. We now assume
that the fundamental quadrilateral is continuously reducible to a “rectangle”
in the manner described further above. This reduction will result in $\alpha$, $\beta$,
$\gamma$, $\delta$ all taking the value $\tfrac12$ and $\lambda$ tending continuously to a value $\lambda_0$. (9) will
then take the form

$\eta'' + \lambda_0 \eta = 0$.

The two multiplicative solutions of this equation are

$\eta_1 = e^{\sqrt{-\lambda_0}\,t}$, $\eta_2 = e^{-\sqrt{-\lambda_0}\,t}$.

In view of

$\eta_{1,2}(t + 2\omega_1) = \exp\{\pm 2\omega_1\sqrt{-\lambda_0}\}\,\eta_{1,2}(t)$,

$\lambda_0$ must be negative, since the periodicity factor $\exp\{2\omega_1\sqrt{-\lambda_0}\}$ is necessarily
real (the two circles are assumed not to intersect). Hence both $\eta_1$ and $\eta_2$
are multiplied by positive factors if $2\omega_1$ is added to the argument. Now $p$
and $r$ were kept constant during our process of continuous reduction; moreover,
the periodicity factors being continuous functions of all parameters
entering the differential equation, a discontinuous jump from positive to
negative values is impossible. Hence, the periodicity factors were positive
from the beginning. We have thus proved that in case the quadrilateral is
continuously reducible to a rectangle, the negative value of the cross-ratio
has to be taken. On the other hand, it is easy to see that in case the
periodicity factor is positive (and, consequently, the cross-ratio negative),
such a reduction must always be possible. If it is impossible, we shall therefore
have to take the positive cross-ratio. Recalling our definition of the
cross-ratio $\theta_1$, $\mu$ will accordingly in all cases be a solution of


$D_1(\mu) = \theta_1$.


Using the Fourier expansions of the $\wp$-functions constituting $g(t)$ in
(9a), viz.,


9 (t+ (or + 02/2) =m/or— — 


n=-CO 


and the analogous expressions for $\wp(t \pm \omega_1/2 \pm \omega_2/2)$, and remembering
that $\mu = d_0 + (\omega_1/\pi)^2\lambda$, we have thus established formulas (4) and (5).
(4a) follows in exactly the same way, the only difference being that the parts
of $\omega_1$ and $\omega_2$ have now to be interchanged and the solutions of (9) have to be
considered in a strip parallel to the imaginary axis.

So far, we have assumed that the pair of circles forming two opposite
sides of the quadrilateral, e.g. $K_{ab}$ and $K_{c\infty}$, do not intersect. If they do,
under an angle $\epsilon$, say, the argument has to be but slightly modified. The
linear transformation now to be applied to the quadrilateral will be such as
to transform the two points of intersection of $K_{ab}$ and $K_{c\infty}$ into the points 0
and $\infty$ respectively. $K_{ab}$ and $K_{c\infty}$ will then become two straight lines intersecting
at the origin under the angle $\epsilon$. Two successive inversions with
respect to these two lines will result in a point $w$ being transformed into a
point $we^{\pm 2i\epsilon}$, where the positive sign may without loss of generality be taken.
By Schwarz's symmetry principle, we shall therefore have, similarly to (11),

$w(t + 2\omega_1) = e^{2i\epsilon}w(t)$

and consequently

$\eta_1(t + 2\omega_1) = \pm e^{i\epsilon}\eta_1(t)$,
$\eta_2(t + 2\omega_1) = \pm e^{-i\epsilon}\eta_2(t)$,

which again are the multiplicative solutions of Floquet's theory. The rest
of the argument can be repeated all over, the only difference being that
instead of the cross-ratios $\theta$ and $\theta^*$ there now appears $\sin^2 \epsilon/2$ or $\cos^2 \epsilon/2$
according as the periodicity factors are taken as $+e^{\pm i\epsilon}$ or $-e^{\pm i\epsilon}$.
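The appearance of $\sin^2 \epsilon/2$ and $\cos^2 \epsilon/2$ follows from the same relation $D_1(\mu) = \sin^2 \pi\sigma$, $\sigma = is/2\pi$, used in the non-intersecting case. The short computation below is a sketch that takes $e^s$ equal to the periodicity factor of $\eta_1$, by analogy with (11a); this identification is an assumption of the note.

```latex
% Factor +e^{i\epsilon}: one solution is s = i\epsilon.
\[
e^{s} = e^{i\epsilon}
\;\Longrightarrow\;
\sigma = \frac{is}{2\pi} = -\frac{\epsilon}{2\pi},
\qquad
\sin^{2}\pi\sigma = \sin^{2}\frac{\epsilon}{2}.
\]
% Factor -e^{i\epsilon} = e^{i(\epsilon + \pi)}: one solution is s = i(\epsilon + \pi).
\[
e^{s} = -e^{i\epsilon}
\;\Longrightarrow\;
\sigma = -\frac{\epsilon + \pi}{2\pi},
\qquad
\sin^{2}\pi\sigma = \sin^{2}\Bigl(\frac{\epsilon}{2} + \frac{\pi}{2}\Bigr)
                  = \cos^{2}\frac{\epsilon}{2}.
\]
% The two values again add up to 1, as \theta and \theta^* do:
\[
\sin^{2}\frac{\epsilon}{2} + \cos^{2}\frac{\epsilon}{2} = 1 .
\]
```

Replacing $\epsilon$ by $-\epsilon$, or $s$ by $s + 2\pi i k$, changes neither value, so the choice of sign in $e^{\pm i\epsilon}$ is immaterial here.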

Now $\sin^2 \epsilon/2$ and $\cos^2 \epsilon/2$ each coincide with one of the cross-ratios $\theta$
and $\theta^*$ defined above. If the two circles are orthogonal, both cross-ratios
reduce to $\tfrac12$. If the two circles touch, the cross-ratios reduce to 0 and 1
respectively. For reasons of continuity it is therefore clear that if the correct
value of the cross-ratio is between 0 and $\tfrac12$ and the two circles are detached
from each other in a continuous manner without passing a position in which
they are orthogonal, the cross-ratio must become negative; if the cross-ratio
is between $\tfrac12$ and 1 and the circles are detached in such a manner, the
cross-ratio will necessarily become $> 1$. It is therefore seen that the number $\theta_1$
defined above will give in all cases the correct value of the cross-ratio.

As shown in the foregoing, we are able to deduce a transcendental
equation for the accessory parameter if one of the geometrical quantities $\theta_1$
or $\theta_2$ is given. This, however, will not always be the case. For instance,
in the theory of automorphic functions, the fundamental condition imposed
on the quadrilateral is the existence of a circle orthogonal to all its sides.
In the most general case, any condition invariant with respect to an arbitrary
linear transformation may be given.

Now a particular type of curvilinear quadrilateral with given angles and
known “modulus” (i.e. the cross-ratio of the points $a$, $b$, $c$, $\infty$) is characterized
by one and only one invariant condition (there is only one accessory
parameter). Accordingly, any invariant condition imposed upon it must be
expressible as an identical relation between $\theta_1$ and $\theta_2$; independence of $\theta_1$ and
$\theta_2$ would mean the possibility of imposing two independent invariant conditions.
If this identical relation is of the form

$F(\theta_1, \theta_2) = 0$,

the accessory parameter $\lambda$ will be a solution of the equation

$F\{D(\lambda), D'(\lambda)\} = 0$.

The exact form of $F$ has to be determined separately in each particular case
by elementary geometry.




As an example, we shall consider the case of a quadrilateral with zero
angles and an orthogonal circle (this leads to an automorphic function with
four “holes”, the most immediate generalization of the elliptic modular
function). We may assume that the orthogonal circle coincides with the real
axis; this can always be brought about by a suitable linear transformation.
Since all angles are equal to zero, all the vertices of the quadrilateral must
lie on the orthogonal circle, i.e. on the real axis, and the same will be true
of the centers of $K_{ab}$, $K_{bc}$, $K_{c\infty}$, $K_{\infty a}$. Both cross-ratios $\theta_1$ and $\theta_2$ will therefore
refer to the same four points, viz. the four vertices. $\theta_1$ and $\theta_2$ are, however,
not identical. By transforming each pair of circles into concentric ones,
it is seen that we must have $\theta_1\theta_2 = 1$. $\lambda$ will therefore be a solution of


$D(\lambda)D'(\lambda) = 1$.


As shown by (6), D(A) takes a particularly simple form in this case 
(«= 8 = y = 8 = 0), all non-vanishing terms of the determinant being real. 
We now make. some remarks as to the general character of the solutions 
of equations of the type of (4), (4a), or (6). 
A determinant

$|\delta_{n\nu} + A_{n\nu}|$  ($\delta_{n\nu} = 0$, $n \neq \nu$; $\delta_{\nu\nu} = 1$)

being bounded by

$\prod_{n=-\infty}^{\infty} \bigl(1 + \sum_{\nu=-\infty}^{\infty} |A_{n\nu}|\bigr)$,

it can easily be shown that $D(\lambda)$ is an integral function of $\lambda$ of order not
exceeding 1/2. Since an integral function of order < 1 cannot have
exceptional values, equations (4) and (4a) will have an infinity of solutions.
The same will be true of (6), as the relation $F(\delta_1, \delta_2) = 0$, arising from
elementary geometrical constructions, is algebraical and the equation
determining $\lambda$ can also in this case be written in the form $P(\lambda) = 0$ where $P(\lambda)$
is an integral function of order $\le$ 1/2.
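The determinant bound invoked above can be illustrated on a finite truncation. The sketch below is an assumption-laden illustration, not the paper's computation: the matrix entries are random numbers with an arbitrary decay away from the diagonal, and the names `det`, `row_product_bound` and `truncated_determinant` are hypothetical helpers. It compares the determinant of $\delta_{n\nu} + A_{n\nu}$ with the product of the row sums $1 + \sum_\nu |A_{n\nu}|$.

```python
import random


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def row_product_bound(a):
    """Product over rows n of (1 + sum over nu of |A[n][nu]|)."""
    bound = 1.0
    for row in a:
        bound *= 1.0 + sum(abs(x) for x in row)
    return bound


def truncated_determinant(a):
    """Determinant of the finite matrix (delta_{n nu} + A[n][nu])."""
    n = len(a)
    return det([[(1.0 if i == j else 0.0) + a[i][j] for j in range(n)]
                for i in range(n)])


random.seed(0)
N = 6  # truncation size, illustrative only
A = [[random.uniform(-1, 1) / (1 + (i - j) ** 2) for j in range(N)]
     for i in range(N)]
value = truncated_determinant(A)
bound = row_product_bound(A)
print(abs(value) <= bound)
```

The inequality holds for every finite truncation because each term in the expansion of the determinant is a product of one entry from each row.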

In the general case not all of these solutions are necessarily real. Indeed,
it may be shown by elementary examples that an equation of the type
$D(\lambda) = \delta_1$ may have only one real solution.

To each real solution $\lambda$ there corresponds a pair of solutions of (9)
leading to quadrilaterals with the prescribed angles and satisfying the addi- 
tional geometrical conditions. These solutions differ by the number of times 
and the particular manner in which the quadrilateral overlaps itself; a full 
discussion of the various types of quadrilateral obtainable in this way will 
be found in the papers of Klein and Koenig quoted above. There is one 




36 ZEEV NEHARI. 


particular value of $\lambda$ leading to a quadrilateral in the ordinary sense, i.e.,
one not overlapping itself, if such a quadrilateral is geometrically possible.

If there is an infinity of real solutions $\lambda$, say $\lambda_1, \lambda_2, \ldots, \lambda_n, \ldots$,
some indication as to their asymptotical behavior is given by the fact that

$\sum_n |\lambda_n|^{-1/2-\epsilon}$  ($\epsilon > 0$)

converges, the values $\lambda_n$ being the zeros of an integral function of order $\le$ 1/2.
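A concrete instance may help here. The function $\cos\sqrt{\lambda}$ is an integral function of order 1/2 whose zeros are $\lambda_n = ((n - \tfrac12)\pi)^2$; it is chosen purely as an illustration and is not the determinant $D(\lambda)$ of the paper. The sketch below compares partial sums of $\sum |\lambda_n|^{-1/2-\epsilon}$, which settle down, with those of $\sum |\lambda_n|^{-1/2}$, which keep growing.

```python
import math


def zero(n):
    """n-th positive zero of cos(sqrt(lam)), n = 1, 2, ..."""
    return ((n - 0.5) * math.pi) ** 2


def partial_sum(exponent, terms):
    """Sum of zero(n) ** (-exponent) for n = 1, ..., terms."""
    return sum(zero(n) ** (-exponent) for n in range(1, terms + 1))


eps = 0.25  # illustrative choice of epsilon
s_small = partial_sum(0.5 + eps, 10_000)
s_large = partial_sum(0.5 + eps, 100_000)
d_small = partial_sum(0.5, 10_000)
d_large = partial_sum(0.5, 100_000)

print(s_large - s_small)  # small: the series with exponent 1/2 + eps converges
print(d_large - d_small)  # not small: the series with exponent 1/2 diverges
```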


2. The general case. We shall now consider the functional relationship
between the $n - 3$ accessory parameters $\lambda_1, \ldots, \lambda_{n-3}$ appearing in (1),
and a set of $n - 3$ geometrical variables completely determining the
curvilinear polygon mapped by the ratio $w(z) = y_1/y_2$ of two linearly independent
solutions of (1) onto the half-plane $\Im\{z\} > 0$. This set of geometrical
variables will consist of $n - 3$ cross-ratios of the same type as those employed
in the case $n = 4$.

In order to define the cross-ratios in question, we shall assume that the
real numbers $a_\nu$ appearing in (1) are ordered by their magnitude, i.e.,

$a_1 < a_2 < \cdots < a_{n-1}$.

The stretch $a_\nu < z < a_{\nu+1}$ ($1 \le \nu \le n - 2$) is mapped by $w(z) = y_1/y_2$ on a
circular arc forming part of a circle, say $K_\nu$; the ray $a_{n-1} < z < \infty$ is mapped
on an arc belonging to a circle $K_{n-1}$. $\delta_\nu$ will then denote the cross-ratio of the
four points in which the straight line connecting the centers of $K_\nu$ and $K_{n-1}$
cuts these circles; the particular value of the cross-ratio to be chosen is
defined in a similar way as in the case $n = 4$, the only difference being that
the two sides of the reduced "rectangle" not coinciding with $K_\nu$ and $K_{n-1}$
will be obtained by letting the other sides of the polygon coalesce into two
circular arcs.

As before, we shall not use equation (1) in its rational form, but 
transform it by the introduction of elliptic functions. However, unlike the 
case n = 4 in which one single transformation was sufficient, we shall have 
to apply to (2) $n - 3$ different transformations, each featuring a different
quadruple of singular points $a_\nu$ which is being made to correspond by the
elliptic function to the four corners of the rectangle of half-periods.

Taking, e.g., $a_1, a_2, a_{n-1}, \infty$, we write


21+ (a, + + n1/3), 
— (A, + + = — (1, + + = 


(a, + dz + An1/3) = (e: + e2 +e, =0), 




and set

$z_1 = \wp(u)$,  $(dz_1/du)^2 = 4(z_1 - e_1)(z_1 - e_2)(z_1 - e_3)$.

We then write

$y = s(u)v(u)$,

where $s(u)$ is to be chosen in such a way as to make $v(u)$ satisfy an equation
of the form

(13) $v'' + r(u)v = 0$.

The ratio $w = v_1(u)/v_2(u)$ of two linearly independent solutions of this
equation will map the rectangle $[0, \omega_1, \omega_3, \omega_2]$ ($2\omega_1$, $2\omega_2$ being the periods of
$\wp(u)$, $\omega_1$ real, $\omega_2$ purely imaginary, $\omega_3 = \omega_1 + \omega_2$) on the curvilinear polygon,
the four corners of the rectangle corresponding to the vertices of the polygon
of indices 1, 2, $n - 1$, $n$.

Carrying out the necessary calculations, it is found that the function 
r(w) appearing in (13) is of the form 

(14) + (4 — (U— a2) + (4— an) (u) 


+S by) + a, 
where by is a solution of 


dy — + a2 + = @ (bv) 
and ¢(w) is Weierstrass’ ¢-function. The n—3 unspecified constants 
appearing in (13), viz., ms, * Pn-2, may easily be linearly expressed 
in terms of the constants :,° + *,An-3 in (2) by comparing the expansions 
of y(z) and v(u) and observing the formulas connecting z and uw. 

$r(u)$ is a periodic function of $u$ with period $2\omega_1$ and may therefore be
expanded into a Fourier series converging absolutely in the interior of a
strip, parallel to the real axis, in which $r(u)$ is regular. This is certainly
the case for $0 < \Im\{u\} < \Im\{\omega_2\}$. In order, however, to be able to operate
on the real axis, we write 


thus replacing (13) by 
(15) + {9(t) + = 0. 




The Fourier series

$g(t) = \sum_{n=-\infty}^{\infty} c_n e^{(\pi i n/\omega_1)t}$

will then converge absolutely for $-\Im\{\omega_2/2\} < \Im\{t\} < \Im\{\omega_2/2\}$.
(15) can now be treated in exactly the same way as (9a). The equi- 


valent of (4) will then be an equation 


(16) D(p, * * = 8, 

where D(p, @n-2) is again an absolutely converging determinant of 
Hill’s type, built on the Fourier coefficients, cy of g(t). These coefficients cy 
are linear functions of the parameters ps, ws,° * *,@n-2 and can easily be 
written down explicitly with the help of (14) and the well-known Fourier 
expansions of the functions @ and ¢. As observed above, the numbers 


* *5 @n-2, are linear functions of the accessory parameters *, An-s 
of (2). (16) may therefore be replaced by an equation 
(17) * *, An-3) = 41. 


(17) gives a relation between the accessory parameters A;,° * -,An-3 and 
the geometrical variable 6,, all the other quantities entering (17) being 
expressible in terms of the singularities ay and the angles ray. As in the 


case n=4, it is easily seen that D(ys3,- + -,n-2,m) and, hence, also 
D,(Ax,* * *;An-3) is an integral function of order < $ in all its arguments. 
Accordingly, if 6, and n—4 of the accessory parameters, say A2,° * +, An-s; 


are known, there is a discrete infinity of values A, satisfying (17). 
If the process resulting in (17) is repeated with the quadruple a, dv,1, 
Qn1, 0(v<n—1) taking the place of a1, do, 0, We arrive at an 


analogous equation 
(18) Dy (Aa, ° An-s) = 6,. 


Since there are altogether $n - 3$ equations (18), they will suffice to
determine the possible sets of accessory parameters when the cross-ratios $\delta_\nu$ are
known; as before, we are of course only interested in real solutions of (18),
as only these give rise to curvilinear polygons in the ordinary sense. There
may be an infinity of sets of real solutions of (18) but there is one and only
one set leading to a polygon not overlapping itself, if such a configuration
is geometrically possible.
If instead of the cross-ratios 6,» another set of n —3 invariant conditions 


characterizing the curvilinear polygon is given, we proceed as follows: In 
addition to the quadruples of points dv, dv.1, Gn1, 0 Wwe also consider the 
sets 00, (1, Qk, (2k Sn—2) and define the cross-ratios 6’, related 




to these sets in the same way the cross-ratios 6, were defined with respect to 
vy, Avs1, An-1, ©. By the same procedure as before, we then arrive at a system 
of equations 

(19) An-3) = 2,- -,n—2) 


similar to (18). Now if a set of $n - 3$ independent conditions determining
the geometrical configuration of the polygon is given, it is clear that there
must exist $n - 3$ independent identical relations of the type


(20) 025° * +5 = 0, 

as there cannot exist more than n—3 independent cross-ratios. The par- 
ticular form of the functions F;, has to be determined in each case by 
elementary geometry. By replacing, in (20), 6. and 6’, by the expressions 
(18) and (19), we thus arrive at n — 3 independent equations for the n — 3 
unknowns. 


THE HEBREW UNIVERSITY, 
JERUSALEM. 


BIBLIOGRAPHY. 


[1] H. Schwarz, Gesammelte Mathematische Abhandlungen II, Berlin (1890), pp. 
211-259. 

[2] D. Hilbert, “ Grundzuege einer allgemeinen Theorie der linearen Integralgleichungen, 
V,” Goettinger Nachrichten (1906), pp. 439-480. 

[3] R. Koenig, “ Anwendung der Integralgleichungen auf ein Problem der Theorie der
automorphen Funktionen,” Mathematische Annalen, vol. 71 (1912), pp.
206-213.

[4] F. Klein, “ Bemerkungen zur Theorie der linearen Differentialgleichungen zweiter 
Ordnung,” Mathematische Annalen, vol. 64 (1907), pp. 175-196. 

[5] E. Hilb, “ Ueber Kleinsche Theoreme in der Theorie der linearen Differential- 
gleichungen,” Mathematische Annalen, vol. 66 (1909), pp. 215-257; 
Mathematische Annalen, vol. 68 (1910), pp. 24-74. 

[6] W. v. Koppenfels, “ Konforme Abbildung ausgezeichneter Kreisbogenvierecke,” 
S.-B. Math.-Nat. Abh. Bayer. Akad. Wiss. (1943), pp. 327-343. 

[7] N. Svartholm, “ Die Loesung der Fuchsschen Differentialgleichung zweiter Ordnung 
durch hypergeometrische Polynome,” Mathematische Annalen, vol. 116 
(1939), pp. 413-421. 

[8] A. Erdelyi, “The Fuchsian equation of second order with four singularities,” 
Duke Mathematical Journal, vol. 9 (1942), pp. 48-58. 

[8] A. Erdelyi, “Certain expansions of solutions of the Heun equation,” Quarterly 
Journal of Mathematics, Oxford Ser., vol. 15 (1944), pp. 62-69. 

[9] A. Schoenfliess, ‘“‘ Ueber Kreisbogendreiecke und Kreisbogenvierecke,” Mathe- 
matische Annalen vol. 44 (1894), pp. 105-124. 

[10] W. Ihlenburg, “ Ueber die gestaltlichen Verhaeltnisse der Kreisbogenvierecke,” 
Goettinger Nachrichten (1908), pp. 225-230. 
See e.g. Whittaker and Watson, Modern Analysis, 4th Ed., p. 415. 


ON THE MODE OF INCREASE OF THE PRODUCT OF BASIC SETS 
OF POLYNOMIALS.* 


By M. NASSIF.


1. The mode of increase of a basic set of polynomials is determined by
its order¹ $\omega$ and type $\gamma$; thus a set of order $\omega'$ and type $\gamma'$ will be of smaller
increase than the above one if $\omega' < \omega$ or if $\omega' = \omega$ and $\gamma' < \gamma$. The aim of
this note is to give an upper bound for the increase of the product set of
two basic sets of polynomials of given increase. Concerning this point of
view, two results are already known. The first concerns the product of two
simple sets of given order and gives an upper bound for the order of their
product.²

It states that if the simple sets $\{p_n(z)\}$ and $\{q_n(z)\}$ be of order $\omega_1$ and
$\omega_2$ respectively then the product set $\{u_n(z)\}$ of the sets $\{p_n(z)\}$ and $\{q_n(z)\}$
in the given order³ is of order not exceeding $\omega_1 + 2\omega_2$. However, to give a
more complete solution of the problem in this case we must take into account
the type of each constituent set as well as the type of their product. In fact
it is here shown that the upper bound for the increase of the product set of
two simple sets is independent of the types of the constituent sets, provided
that these types are finite. This result is demonstrated in the following


THEOREM I. Let $\{p_n(z)\}$ and $\{q_n(z)\}$ be simple sets of polynomials
of order $\omega_1$ and $\omega_2$ respectively and let each be of finite type. Then the
product set $\{u_n(z)\}$ is of increase not exceeding order $\omega_1 + 2\omega_2$ type zero.


The second result is given by J. M. Whittaker and is concerned with the
product of two sets of which the inner set $\{q_n(z)\}$ is of the form $\{(Az + B)^n\}$.
In fact Whittaker⁴ has shown that if $\{p_n(z)\}$ is of order $\omega$ and type $\gamma$ then
the product set {wn(z)} of {pn(z)} and {qn(z)}, as given above, is of order 
type y| A 
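The product construction can be made concrete with a short sketch. It assumes, as an interpretation of the text rather than a quotation from it, that the product set is formed as $u_n(z) = \sum_i p_{ni} q_i(z)$, where $p_n(z) = \sum_i p_{ni} z^i$; with the inner set $q_i(z) = (Az + B)^i$ this reduces to $u_n(z) = p_n(Az + B)$. The polynomials in `P` and the values of `A` and `B` are arbitrary illustrative choices, and the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def linear_powers(A, B, count):
    """Return the inner set q_i(z) = (A z + B)**i for i = 0, ..., count - 1."""
    powers = [[1]]
    for _ in range(count - 1):
        powers.append(poly_mul(powers[-1], [B, A]))
    return powers


def product_set(P, Q):
    """Form u_n(z) = sum_i p_{ni} q_i(z) from coefficient lists P[n] and polynomials Q[i]."""
    U = []
    for p in P:
        u = [0] * max(len(Q[i]) for i in range(len(p)))
        for i, coeff in enumerate(p):
            for k, q_coeff in enumerate(Q[i]):
                u[k] += coeff * q_coeff
        U.append(u)
    return U


# Illustrative simple set: p_0 = 1, p_1 = z + 3, p_2 = z^2 - 2z + 5.
P = [[1], [3, 1], [5, -2, 1]]
Q = linear_powers(2, 1, 3)  # q_i(z) = (2z + 1)^i
U = product_set(P, Q)
print(U)  # coefficients of p_n(2z + 1), constant term first
```

In matrix terms this is the product of the coefficient matrix of $\{p_n\}$ with that of $\{q_n\}$, which is why the order of the two factors matters.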

This result suggests the study of the increase of the product of two sets 


* Received October 22, 1947. 

1 J. M. Whittaker, Interpolatory Function Theory, Cambridge (1933), Ch. I, p. 11. 

2 M. Nassif, “On the order of the product of simple sets of polynomials,”
Proceedings of the Mathematical and Physical Society of Egypt, vol. III, 2 (1946), pp. 43-47.

3 We shall always take the set $\{u_n(z)\}$ to be the product set of the sets $\{p_n(z)\}$
and $\{q_n(z)\}$ in the given order, unless it is otherwise stated.

4 Loc. cit. (1), Ch. I, p. 12. 




THE PRODUCT OF BASIC SETS OF POLYNOMIALS. 41 


the inner one of which is simple and of zero order. Let $z^n$ be expressed by
the basic sets $\{p_n(z)\}$ and $\{q_n(z)\}$ in the forms:

(1.1) $z^n = \sum_i \pi_{ni} p_i(z)$,

and

(1.2) $z^n = \sum_i \lambda_{ni} q_i(z)$.

Write

$\omega_n^{(1)}(R) = \sum_i |\pi_{ni}| A_i(R)$,

where $A_n(R) = \max_{|z|=R} |p_n(z)|$, and denote by $\omega_n^{(2)}(R)$ the corresponding
expression for $\{q_n(z)\}$, writing $B_n(R)$ for the maximum value of $|q_n(z)|$
in $|z| \le R$.

Now if {pn(z)} is simple then according to the result given in the 
above quoted paper, the order of the set {twn(z)} will not exceed that of 
{pn(z)}. In order to give a precise upper bound for the increase of the 
product set {wn(z)} in this case we have to take the type of {pn(z)} into 
account and, for the inner set {qn(z)} we consider the behaviour of 8°) (R), 
given by 


82) (R) = lim {wn'?) 


In fact we shall confine ourselves to the case where 8‘*)(R) is finite 
for finite values of R; and since {qn(z)} is simple and consequently 8°) (R)/R 
is monotonic decreasing, as given by Whittaker,> then 8°)(R) =O(R) as 
R tends to infinity. We shall further impose the condition 


(1.3) $0 < c \le \liminf_{n\to\infty} |q_{nn}|^{1/n} \le \limsup_{n\to\infty} |q_{nn}|^{1/n} < \infty$.
For the set {pn(z)} we shall assume that it is a Cannon set. Under these 
conditions the following theorem can be considered as a generalisation of 
Whittaker’s result : 


THEOREM II. Let $\{p_n(z)\}$ be a Cannon set of polynomials of order $\omega$
type $\gamma$ and let $\{q_n(z)\}$ be a simple set, satisfying (1.3), for which $\beta^{(2)}(R)$
is finite for finite values of R. The product set $\{u_n(z)\}$ will then be of increase
not exceeding order $\omega$ type $\gamma(\beta/c)^{1/\omega}$, where $\beta = \lim_{R\to\infty} \beta^{(2)}(R)/R$.


The problem is then considered for a wider class of basic sets of poly- 
nomials. It is however shown that no upper bound for the order of the 
product set is obtainable if the order of growth of the coefficients of the 


5 Series of Polynomials, Edited by Fouad I University, Cairo (1942), § 4, p. 13. 




42 M. NASSIF. 


polynomials of the inner set $\{q_n(z)\}$ is unrestricted. In fact we impose the
restriction:

(1.4) $|q_{ni}| = O(n^{\mu n})$, for all $i$,

where $\mu$ is a positive finite number. Then the following theorem shows that,
with this restriction, no generalisation can be effected in the class of constituent
sets beyond the class for which $D_n = O(n)$, where $D_n$ is the degree of the
polynomial of highest degree in (1.1) and (1.2).


THEOREM III. Let $\omega_1$ and $\omega_2$ be respectively the order of the basic sets
$\{p_n(z)\}$ and $\{q_n(z)\}$ for which $\lim D_n/n = k_1$ and $k_2$ respectively, and suppose
further that $\{q_n(z)\}$ satisfies (1.4). Then the product set will be of order
not exceeding $k_2\omega_1 + \omega_2 + \mu k_1 k_2$.


2. Proof of Theorem I. Let $f(z) = \sum a_n z^n$ be an integral function of
order $1/(\omega_1 + 2\omega_2)$ and type $\alpha$, where $\alpha$ is any positive number. We shall
prove that the set $\{u_n(z)\}$ represents $f(z)$ in any finite region of the plane.

Now since the set $\{p_n(z)\}$ is of order $\omega_1$ and of finite type then we can
choose a number $k_1 > 1$ and a positive integer $n_1$ such that


(2.1) on™ (R) < (n>). 
Similarly two numbers k, > 1 and nz > n,; are chosen in such a way that 

(2. 2) | Ans | Bi(R) < on (R) < (n> nz; all 7). 
Also for a number a, > a, there exists a positive integer ns; > nz such that 

(2. 3) | dn | < {ae/n(w, + we) (n > ns). 


As f(z) is of order < 1/w, then according to Whittaker ° the set {qn(z)} 
represents f(z) in any finite region of the plane in a series of the form 
Sbngn(z), where by (2.2) and (2.3), bn» can be easily seen to satisfy 


(2. 4) | bn | Bn(R) < + > ns). 
Also since we can always take gn» to be unity then (2.2) implies that 
B,(R) < ken (n > nz). 


Hence if M,(R) is the maximum value of | w»(z)| in | z| S R, then, for 
R=1 and for n> nm, 


(2. 5) M,(R) < (n+1)8(R)An(R) kon", 


® Loc. cit. (1), Ch. I, p. 13. 




where S(R) is bounded. Writing ¢n(R) => | | Mi(P), then by (2.1), 
(2.4) and (2.5) we obtain, for n > and for = 1, 

| On| on(R) <2(m+1)T(R) (or + 
where 7(R) is bounded. Making n tend to infinity it follows that 


(2.6) fim {| bn | S (wr + Jor <1, 


for sufficiently large values of R. Since ¢,(f) is an increasing function of R, 
(2.6) holds for all positive values of R. Now as it is easy to deduce from 
the definition of the product set that qn(z) =i mniwi(z) then in view of 


(2.6) it can be easily shown that the set {un(z)} represents f(z) in any 
finite region of the plane. ; 

We now apply Cannon's theorem⁷ to deduce from this statement that
the product set $\{u_n(z)\}$ will be of increase not exceeding order $\omega_1 + 2\omega_2$,
type $1/\alpha$. Finally since $\alpha$ can be taken as large as we please we conclude
that the product set is of increase not exceeding order $\omega_1 + 2\omega_2$ type zero,
as required.

The following example shows that the upper bound for the increase of 
the product set {w,(z)} as given by Theorem I is the best possible. 


Example I: Consider the basic sets {pn(z)} and {qn(z)} given by 


an(nt) + nlen*/(log n)" + 2” (n odd), 
pn(2) gn (n even). 
and 
1 (n= 0), 
Qn(2) = < + + gn (n even>0), 
2° (n odd), 


where a and b are any positive finite numbers. 


It is easily seen that {pn(z)} is of order 1 type a, and {qn(z)} is of order 


3 type 
Forming the product set {un(z)} we obtain 
1 (n= 0) 
ti (2) = (3b) + (mn!) /(log n)" + 2” (n even > 0) 
(logn)* (log n) "(log (n— 1) 
(log n)* +2". (n odd). 


7B. Cannon, “On the representation of integral functions by general basic series,” 
Mathematische Zeitschrift, Bd. 45(1939), p. 187. 




Let 
= ynitts(2), 
and write 
(2. 7) On(R) = > | -yne | Mi(R). 


Hence for the above set, it is easily seen that 
2(3b)8 (n —1) !}8 
(log n)” 
(log n)"{log (log ny* 


= 2a"n! + 


+ 


when n is odd, and 


2a"1(n!)3(n—1)! 2(3b)8(*) (n!)8{(n — 2) }8(n— 1)! 


2,(R) = 


(log ny" (log n)"{log (n— 1) 

4 2(n!)3(m— 1) (n— 2) 2(n!)3(n—1) | 
(log n)*{log (n— 1) }**{log (n—2)}"= (log n)*{log (n— 


(log 
when n is even >0. Hence the order Q of the product set is given by 


2 =lim log Qn(R)/n log n 


1 (3b) 3-2) — 2) 1}3(n —1) ! 
n log n log (log n)"{log (n—1)}* = 7, 
and since _— 


then the type of the product set is zero as given by the theorem. 
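The limit defining the order, $\lim \log \Omega_n(R)/(n \log n)$, is approached slowly, which is worth keeping in mind when checking such examples numerically. The sketch below is an illustration only: it replaces $\Omega_n(R)$ by a hypothetical quantity growing exactly like $(n!)^7$, for which the limit is 7 (the value $\omega_1 + 2\omega_2 = 1 + 2\cdot 3$ of Theorem I for this example), and evaluates the ratio at a few values of $n$.

```python
import math


def growth_ratio(log_omega_n, n):
    """Return log(Omega_n) / (n log n), the quantity whose limit gives the order."""
    return log_omega_n / (n * math.log(n))


def log_factorial_power(n, k):
    """Logarithm of (n!)**k, computed through the log-gamma function."""
    return k * math.lgamma(n + 1)


# Hypothetical stand-in for Omega_n(R): a quantity growing like (n!)^7.
ratios = [growth_ratio(log_factorial_power(n, 7), n) for n in (10, 10**3, 10**6)]
print(ratios)  # increases towards 7, but is still visibly below it at n = 10**6
```

Since $\log n! = n\log n - n + O(\log n)$, the ratio differs from its limit by a term of size $7/\log n$, which explains the slow approach.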


3. Proof of Theorem II. We take f(z) any integral function of order 
1/o type < Then numbers > y, ki > k, < and o, >¢ 
can be chosen in such a way that 


(3.1) 1/y1(¢1/k1) 


Corresponding to these numbers, positive integers n, and n.> mn, and 
a number R, exist so that 


(3. 2) on™ (R) < {nyiw/e}r (n>n,; all R>0) 
and 
(3. 3) | Ans | Bi(R) < (h,R)" (R > Ro, n> nz; all +) 


As in Theorem I, {qn(z)} represents f(z) in any finite region of the 
plane in a series of the form 3Bngn(z), where Bn is such that 
(3.4) | Bu | Bu(R) < 
for R > Ry, and for sufficiently large values of n, for n > mM, say. 


Also, since the set {qn(z)} is simple then Ann = 1/qnn; thus in view of 
(1.3) and (3.3) an integer nz; > m2, corresponding to c, and a .positive 
number b < o, exists so that 


(3. 5) (c:R)" < Br(R) < (bk, (n>n3; R>R,). 
Let R, > R > Ro, then (3.5) yields, for n > ns, 


(3. 6) Mn(R) < +E} | por | Br(R) 


< An(bk,R,)S(R, 


where S(R,R,) is bounded. As usual the relations (3.2), (3.4), (3.5) 
and (3.6) are combined together to yield, for n > (mo, ns), 


| Bn | on(R) < 2T(R, Ri) (ory), 
where 7(R, R,) is bounded. In view of (3.1) we conclude that 
lim { | Bn | <1, 


and thus the proof of the theorem is completed as in Theorem I. 


An example is again constructed to show that the upper bound given 
in Theorem II is exact. 


Example II. Consider the basic sets {pn(z)} and {qn(z)} given by 


2" (n even), 
Pa(2) (4a)*"(n!)* + (n!)42"-*/ (log n)" + 2” (n odd), 

and 
1 (n=0), 
gn(z) 1 + 4 2” (n even > 0), 
cnyn (n odd), 


where a and ¢ are positive finite numbers and c < 1. 
It is easily seen that the set $\{p_n(z)\}$ is of order 4 type $a$, and, for the
set $\{q_n(z)\}$, $\beta^{(2)}(R) = 2R$ for $R \ge 1/2$, and

$0 < c = \liminf_{n\to\infty} |q_{nn}|^{1/n} < \limsup_{n\to\infty} |q_{nn}|^{1/n} = 1$.




The product set {wn(z)} is given by 


1 (n= 0), 
Un(z) = 2" (nm even >0), 
1\4 
(n odd). 


It is easily seen by the aid of (2.7) that, when n is odd, 


2(n!)4*Rr- 
c"(log n)” 


2(4a)*" 
O,(R) — 4)" 


and when n is even > 0, 


0,(R) = 


cnt 


c™ {log (n — 1) {log (n —1)}"" 


On calculating the order $\Omega$ and type $\Gamma$ of the product set $\{u_n(z)\}$ we can
easily see that $\Omega = 4$ and $\Gamma = a(2/c)^{1/4}$, as given by Theorem II.


4. The class of simple sets is the only class which admits an upper
bound for the increase of the product of two sets of given increase, without
restriction on the coefficients of the polynomials of the sets, apart from that
governing the coefficient $q_{nn}$. In fact if $\{q_n(z)\}$ is not simple and of positive
order, then we can find instances in which the order of the product set is
infinite, even if $q_{nn} = 1$, as is seen in the following example:


Example III. Consider the basic sets {pn(z)} and {qn(z)} given by 


(n=0), 
Pn(2) — 1. (n even > 0), 
(n odd), 
and 

(n=0), 

1 
Qn(z) = (n (n even > 0), 
(n=1), 
(n?+3n) (n oda > 1). 


We see that lim D,,/n = 2 for the set {pn(z)} and is equal to 1 for the 




set {qn(z)} although this last set is not simple. We also observe that {pn(z)} 
is of order 1 and {q,(z)} is of order 3. Forming the product set {un(z)} 
we obtain 


2+ 2 (n 0), 
gn-l 
Un = (n > 
1 n=1), 
gn (n2+3n) (n odd > 1). 


It is easily seen that, when n is odd, 


2n3n 2n3"(n 1) (n+1) 


1 diel n 2n+3 
(n 1) (n+1) on + 3) 
Re. 


Thus the order of the product set is

$\Omega = \lim_{n\ \mathrm{odd}} \log \Omega_n(R)/(n \log n) = \infty$.

Proof of Theorem III. The proof of this theorem is on the same lines
as those of the former theorems; our integral function $f(z) = \sum a_n z^n$ is taken
to be of order $\rho < 1/(k_2\omega_1 + \omega_2 + \mu k_1 k_2)$. We choose the numbers $\rho_1 > \rho$,
$\rho_2 > \omega_1$, $\rho_3 > \omega_2$, $K_1 > k_1$ and $K_2 > k_2$ in such a way that

(4.1) $\rho_1 < 1/(K_2\rho_2 + \rho_3 + \mu K_1 K_2)$.


Positive integers ni << m2 are chosen to satisfy the 
following inequalities: 


(4. 2) | @n | (n>m), 
(4. 3) on™ (R) < nvr (n> Nz), 
(4. 4) | Ang | Bi(R) < nes (n> ns; all 4), 
(4. 5) Dn < Kin (n>), 


where D, is defined for the set {pn(z)}, and, 
(4. 6) D'n < Ken (n> ns); 
where D’, is defined for the set {qn(z)}. 

Finally in view of (1.4) we choose the number a > 0, so that 


(4. 7) | qne | < anen (all n and all t). 




Now if f(z) is represented by the set {qn(z)} by a series of the form 
SDngn(z), then we can deduce from (4.2), (4.4) and (4.6), that 


(4. 8) | bn | Bu(R) < 2/(n/K,— 1) (n > Kens). 
Also with the aid of (4.3), (4.5), (4.6) and (4.7) we obtain, for 
n>n; and for R= 1, 
(4.9) a(R) | | 
< a(Kin + 1) (Ki Kon + 1) 


Combining (4.8) and (4.9) we deduce that, in view of (4.1), 
(4. 10) Lim { | bn | Slime 


since we can always take B,(R) =1, for sufficiently large values of R. 
Since ¢,(F) is an increasing function of R then (4.10) holds for all positive 
values of R. The theorem is then completed in the usual way. 

The following example shows that the above result is a best possible one. 


Example IV. Consider the basic set {qn(z)} given by 


(n= 0), 
1 
Qn(z) = (n— (n even > 0), 
(n=1), 
gr + ningntt (n > 1). 


We see that $|q_{ni}| \le n^{4n}$, and $\lim D_n/n = 1$, so that $\mu = 4$ and $k_2 = 1$.


The order of this set is easily seen to be equal to 3. Forming the product 
set {un(z)} of the set {pn(z)} given in Example III and the set {qn(z)} 


given above we obtain 


(2+2 (n= 0), 
gn-1 
Un(zZ) = 4 (n— 1)%¢ 1) (n even > 0), 
1+2 (n=1), 
gh (n odd > 1). 


Hence in view of (2.7), 2,(R) is given by 


0,(R) = 1/{(n— 1) — 1} [2R"4/(n— 4 ((n —1) 4 
Inn R2n+1 2n"(2n + 1) 




when n is even > 2, and 


1) (n+1) (2n + 
1—n" 


R2nt8 


0,(R) = 


+2 


2n+4 


when n is odd > 1. On calculating the order $\Omega$ of the set $\{u_n(z)\}$ we obtain

$\Omega = 12$,

a result which conforms with the upper bound given in the above theorem.
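As a quick arithmetic check, the bound can be evaluated directly. This is a sketch under the assumption that the bound of Theorem III reads $k_2\omega_1 + \omega_2 + \mu k_1 k_2$, with the values read from Examples III and IV: $\omega_1 = 1$ and $k_1 = 2$ for the set $\{p_n(z)\}$, and $\omega_2 = 3$, $k_2 = 1$, $\mu = 4$ for the set $\{q_n(z)\}$.

```latex
\[
k_2\omega_1 + \omega_2 + \mu k_1 k_2
  = 1\cdot 1 + 3 + 4\cdot 2\cdot 1
  = 12 = \Omega .
\]
```

Under these assumptions the order of the product set attains the bound exactly.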


FAROUK I UNIVERSITY,
ALEXANDRIA, EGYPT.


GAUSS SUMMABILITY OF TRIGONOMETRIC INTEGRALS.* 


By S. BOCHNER and K. CHANDRASEKHARAN.


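1. Introduction. In the case of an ordinary Fourier series

(1) $f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$

the partial sums

(2) $s_n(x) = \sum_{\nu=-n}^{n} a_\nu e^{i\nu x} = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{\sin (n + \frac12)t}{\sin \frac12 t}\, f(x + t)\, dt$

or the customary approximating sums like the Fejér sum

(3) $\sigma_n(x) = \frac{1}{2\pi n}\int_{-\pi}^{\pi} \Bigl(\frac{\sin \frac12 nt}{\sin \frac12 t}\Bigr)^2 f(x + t)\, dt$

are finite integrals, and can be written down immediately. The problem
of convergence arises only if we want to consider their limit as $n \to \infty$. This
is not the case, however, with Fourier integrals for instance. If we introduce
the Fourier transform

(4) $\phi(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\alpha x} f(x)\, dx$

then for the partial sum

(5) $s_R(x) = \int_{-R}^{+R} e^{i\alpha x}\phi(\alpha)\, d\alpha$

we have

(6) $s_R(x) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin Rt}{t}\, f(x + t)\, dt$

and for the Fejér sum

(7) $\sigma_R(x) = \int_{-R}^{+R} \bigl(1 - |\alpha|/R\bigr) e^{i\alpha x}\phi(\alpha)\, d\alpha$

we have

(8) $\sigma_R(x) = \frac{2}{\pi R}\int_{-\infty}^{\infty} \Bigl(\frac{\sin \frac12 Rt}{t}\Bigr)^2 f(x + t)\, dt$.

The expressions (5) and (7) are finite integrals, but the expressions (6) and
(8) already involve convergence requirements on $f(t)$. The difficulties thus
encountered may, however, be said to be due mainly to the "behavior of $f(x)$
at infinity." In fact, the very process of introducing the transform (4)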

* Received January 6, 1948. 


GAUSS SUMMABILITY OF TRIGONOMETRY INTEGRALS. 51 


requires some restrictions on $f(x)$, and such restrictions are usually sufficient
to give convergence and validity to the integrals (6) and (8) for every 
finite R. 

However, in somewhat different situations, there is another source of 
difficulty applying only to expressions of the type (6) or (8) and not at all 
to the analogues of the integrals (4) themselves, and these difficulties stem 
from the spectral structure proper of the function rather than only from a 
variable behavior at infinity. This phenomenon was encountered by Bochner 
[1,2] when generalizing the classical theory from periodic to almost periodic 
functions. If 


(9) f(t) 


is an almost periodic function in the sense of Bohr, in which case the Fourier 
coefficients are determined by the formula 


(10) a, = 1/2T f(t) e**tdt 

then for any given R > 0, there may or may not exist the ‘ Dirichlet-partial- 
sum,’ 

(11) Sr(r) ~ 5 


|An| SR 


in other words, there may or may not exist an almost periodic function Spr(2x) 
whose Fourier expansion is the series on the right of (11). However, there 
always does exist the Fejér sum? 

and it can be represented by the integral (8) which is absolutely convergent 
for any (bounded) almost periodic function. The integral (6) is obviously 
a 'delicate' one, since it is not absolutely convergent for a bounded function,
and indeed as already stated, the partial sum (11) need not 'exist' in cases
in which the spectral point $\lambda = R$ or $\lambda = -R$ (or both) is an accumulation
point of the Fourier exponents $\lambda_n$ occurring in the expansion (9).
But even if the function (11) does exist, as it very emphatically does in the 
case of a pure periodic function (1), its integral representation (6) still 
remains ‘ delicate,’ and the problem arises of securing its convergence and 


* This sum must not be confused with the so-called Bochner-Fejér sums for almost 
periodic functions, which are constructed in an entirely different manner. 




52 S. BOCHNER AND K. CHANDRASEKHARAN. 


validity in some appropriate manner. This problem was recently attacked 
by Bochner in [3], and it was shown that for pure periodic (1) and some 
other almost periodic functions (9), the formula (6) is valid in the following 
version 


(13) on(x) =lim ¢“"(sin Rt/t) f(a + t)dt 


€>0 
which amounts to a Gauss summability of the original version (6). Actually 
the result was stated for spherical summability of multiple Fourier series. 
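The effect of the Gauss factor can be seen in the simplest case $f \equiv 1$. The sketch below is illustrative only and assumes a damping factor of the form $e^{-\epsilon^2 t^2}$, since the exact factor in (13) is not legible in the source. It approximates $\int_{-\infty}^{\infty} e^{-\epsilon^2 t^2} (\sin Rt/t)\, dt$ by Simpson's rule and compares it with the closed form $\pi\,\mathrm{erf}(R/2\epsilon)$, which tends to $\pi$ as $\epsilon \to 0$; the function name, the truncation width and the step count are arbitrary choices.

```python
import math


def damped_dirichlet(R, eps, steps=200_000):
    """Simpson approximation of the integral over the real line of
    exp(-(eps * t)**2) * sin(R * t) / t, truncated where the Gaussian is negligible."""
    half_width = 8.0 / eps  # exp(-64) beyond this point
    if steps % 2:
        steps += 1
    h = 2.0 * half_width / steps

    def integrand(t):
        if t == 0.0:
            return R  # limiting value of sin(R t) / t at t = 0
        return math.exp(-(eps * t) ** 2) * math.sin(R * t) / t

    total = integrand(-half_width) + integrand(half_width)
    for k in range(1, steps):
        t = -half_width + k * h
        total += (4 if k % 2 else 2) * integrand(t)
    return total * h / 3.0


for eps in (0.5, 0.05):
    numeric = damped_dirichlet(1.0, eps)
    closed_form = math.pi * math.erf(1.0 / (2.0 * eps))
    print(eps, numeric, closed_form)
```

The damped integral converges absolutely for every $\epsilon > 0$, whereas the undamped integral converges only conditionally; this is the sense in which the limit $\epsilon \to 0$ regularizes (6).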
If 


and 

=ny2+...+nk? 


0<t< aw, 


where o denotes the unit sphere &?—1, and dw; is the k—1 
dimensional volume-element (if = 1, =f(«+t) + 
then it was proven that 


(17) $°(R) lim (t) (UR) EW 
0 


(16) 


where c= 2°T(8+ 1), for all periodic functions f(z,,- --,a,) of class L, 
and some more general almost periodic functions, for all real 8 such that 


(18) $= 0, 
and furthermore for all complex 8 with 
(19) R(8) > 0. 


Here Ju(x) denotes the Bessel function of order ». See [8]. We recall that 


only for 
(20) 8 > (k—1)/2 


is the integral 
(21) S(R) (tR) dt 
0 
absolutely convergent, and relation (21) is the original version of a formula 


proved in [4]. 




We now state the object of the present paper. Chandrasekharan in 
[5,6] has pointed out that (21), being a Hankel transform, has formally 
an inverse, namely 


(22) fo(#) = 0 99(B) (BE) aR, 
0 
Its precise validity poses a question [7], and an answer is, for instance, that 
it holds for almost all ¢ provided that 
(23) > (k—1)/2. 


We proceed to show that if we interpret (22) as 
oo 
0 


then its validity is extensible to all values (18) and (19) of exponent 6. 
However, at present, some further assumption, different in nature, has to be 
added as regards the behavior of the function at the special point ¢ at which 
the validity of (24) is desired. We shall see that (24) is valid at the point 
t(0<t< o) if and only if we have there 


(25) ()ifo(t) — lim 1/e f 


and this is, in effect, a very mild continuity restriction on the function. 
Furthermore, we shall extend both (17) and (24) from the average fy(t) 
to its p-th mean defined by 


in which case, (17) and (22) are to be replaced by 


where 


and 


0 

respectively, where V,(2) =J,(x)/a*. See [5,6]. They will again be valid 

for (18) and (19) in conjunction with 

(29) R(p) > 0, 




and 


that is, for all possible combinations. We likewise notice that the pair of 
formulas (27) and (28) are Hankel inversions of each other, and our 
reasoning will, in effect, give criteria for the Gauss summability of such 
formulas, under suitable assumptions specifically arising in the theory of 


Fourier series. 


2. The formula. We start from the formula 


oo 
(31) fot) — (te) dp 

0 
for all possible combination of p and 8 for which the right side converges 
absolutely. See [7]. Since for a fixed function - ¢ L we have 
(32) S°(p) = 0(p*) 


uniformly in all x, the right side of (31) converges absolutely for 
(33) p—8—k/2—4>1, 


and uniformly in every closed subset of this set (33) in the (p,8) plane. 


More precisely, we have, for ¢ > 0, 


* co fo) 
por | S°(p) | | (tp) | dp = f f 
0 0 1/t 
1/t oo 
(34) + O( j {8+8/2-p-k/2 / yp-8-k/2-8) dy) 
0 1/t 
== 0(1/t*). 
Now we introduce the ‘approximate partial sum’: 
oO 
0 


Since (33) implies that 
(36) p> 9, 


on the basis of the estimate (34), we can substitute (31) in (35), and 
interchange the integrations with respect to ¢ and p, thus obtaining 




co 
S5«(R) f pk+2p-1-p-5- (k/2) fr(t) (tR) dt 


0 
co oo 
0 0 
0 0 


0 


on account of a formula in [8, p. 395]. Here Jy denotes the Bessel function 
of imaginary argument. 

Relation (37) is valid for the functions $S^\delta(R)$, $S^\delta_\epsilon(R)$ defined by (15)
and (35), provided that $p$ and $\delta$ are subject to the conditions

(38) $p > 0$,
(39) $\delta > 0$,
(40) $p - \delta - k/2 - 3/2 > 0$,

of which (40) implies (38).
Now restriction (40) can be removed. In fact, expression (15) is
analytic in $\delta$ for real part of $\delta > 0$; expression (26) is analytic in $p$ for real
part of $p > 0$. Furthermore, it follows from [8, p. 48(4)]


(2) (p+ 8-4/2 + (1 — age, 
R(p+6-+ (k/2)) >—4, that we have the estimate ‘ 
| S c(A)el*l 
for any compact subset A of the open set
$R(p) > 0$, $R(\delta) > 0$,
of the four-dimensional Euclidean space
$\delta = \delta_1 + i\delta_2$, $p = p_1 + ip_2$.




Therefore, the right side of (35) converges absolutely and uniformly in
every such set A, and (35), as a function of $(p, \delta)$, is complex analytic in A.
Furthermore, for such $p$ and $\delta$, the integral on the right of (37) converges
in a similar manner, and is similarly analytic. Hence, by analytic continuation,
relation (37) also holds for (38) and (39) without (40). Finally, in
all relations, it is admissible to let $p \downarrow 0$, and hence (37) also holds for

(41) $p = 0$,
(42) $\delta = 0$;

actually it holds for

(43) the point-set defined by $R(p) > 0$ and $p = 0$,
(44) the point-set defined by $R(\delta) > 0$ and $\delta = 0$.


Having thus established (37), we now proceed to prove that

(45) $\lim_{\epsilon \to 0} S^\delta_\epsilon(R) = S^\delta(R)$.

Henceforward we shall consider only the case of real $p$ and $\delta$; the analysis
can, however, be extended to the complex case with minor adjustments.

For any fixed R, we estimate the integral in (37) by splitting it into 
three parts: 


The third integral in (46) is easy to estimate. For 0 << ee and p= 2R, 


we have 


(47) | (pR/2e*) | = O (pR)*) 


and hence 




co oo 


=O (R&/?) +p-5-3 f 9) 


2R 
(48) “ 
2R/e 


=0(1), ase—> 0. 


To estimate the first integral in (46), we use | S°(R)| < cs, and consider 


the two subintervals 


(49) pk/é S 1, 
and 
(50) > 1. 


In (49) we have 
Ty.dsck/2) (pR/e2) = O(pR/e2) 
so that 
e/R 

(51) 
= OCR | S op) 
=0(1), 0. 


In (50) we again have estimate (47) and thus 


R/2 R/2 
€ 


e/R 2/R 
but here 
p-R/2€)? __ O ( 
and thus 
R/2 
(52) | | o0(1), as 


From (48), (51) and (52), we obtain the result 


S&<(R) 


2R 
—~ Co | S8(p) (pR/2e2) (1/22) do. 


R/2 


= 




Now, in $R/2 \le \rho \le 2R$, we use
(53) Iy(pR/2e*) = (cs¢/(pR)*) +. O( (e2/pR) 


The error-term in (53) gives rise to 


2R 
O ( 1)1/R "dp, 


but for fixed R, this tends to zero as $\epsilon \to 0$, and thus can be neglected. Hence


we are left with 


2R 


If we put $\gamma = -(k/2) - p + \delta + 1/2$ we have


2R 


[S**(R) — 8°(R)] — S°(R)] (p/R) 


But
$(\rho/R)^\gamma - 1 = O(|\rho - R|)$


and hence 
2R 
R/2 


Since $S^\delta(R)$ is a continuous function of $R$ for $R(\delta) > 0$, this implies (45).
For $\delta = 0$, $S^\delta(R) = (S^\delta(R + 0) + S^\delta(R - 0))/2$.


3. The inverse formula. We want to establish the relation 


(54) fo(t) —lim (1R)AR 


0 


for certain values of $t$, for all $R(p) \ge 0$, $R(\delta) > 0$. For


(55) 

we have 

(56) S°(R) = c,R*p dr. 
0 


See [5,6]. If we substitute this in the right side of (54) we obtain




oo (ore) 
0 
oo 
= Cab f (k/2) (+) (tR) J p154(k/2) (rR) dRdr 
0 0 
fo) 
0 0 
* a0 


0 


and by a similar argument as in section 2, this is equiconvergent with 


2t 
C3/€ f e-(t-1)/28F (7) dr. 
t/2 


PRINCETON UNIVERSITY
INSTITUTE FOR ADVANCED STUDY.


REFERENCES. 


1. S. Bochner, “Properties of Fourier series of almost periodic functions,” Proceedings
of the London Mathematical Society (2), vol. 26 (1927), pp. 433-452.
2. ———, “Über Fourierreihen von fastperiodischen Funktionen,” Sitzungsberichte der
Berliner Mathematische Gesellschaft, vol. 26 (1927), pp. 49-70.
3. ———, “On spherical partial sums of multiple Fourier series,” To appear in Revista
di Ciencias.
4. ———, “Summation of multiple Fourier series by spherical means,” Transactions
of the American Mathematical Society, vol. 40 (1936), pp. 175-207.
5. K. Chandrasekharan, “On multiple Fourier series,” Proceedings of the Indian
Academy of Sciences, vol. XXIV (1946), pp. 229-232.
6. ———, “On the summation of multiple Fourier series I,” Proceedings of the London
Mathematical Society (2), vol. 50 (1948), pp. 210-222.
7. ———, “On Fourier series in several variables,” Annals of Mathematics, vol. 49
(1948), pp. 991-1007.
8. G. N. Watson, A treatise on Bessel functions, Cambridge, 1922.




NOTES ON FOURIER EXPANSIONS III.* 


(Fourier Stieltjes Series) 


By S. MINAKSHISUNDARAM. 


1. If $F(x)$ is a function of bounded variation defined in the interval
$0 \le x \le 2\pi$, and if

(1) $dF(x) \sim \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$


then we are aware of the following results on the summability of the Fourier 
Stieltjes Series on the right of (1): 


i) Almost everywhere the Fourier Stieltjes Series is summable $(C, \alpha)$,
for every $\alpha > 0$.

ii) At a point where $F(x)$ has a derivative the series is summable
$(C, 1 + \alpha)$ for every $\alpha > 0$.

A generalization of this result to k-dimensional space will read as 


follows, if we make use of spherical summation: 


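Let $F(e)$ be an additive function of bounded variation defined on Borel
sets in the interval $(0 \le x_1, \dots, x_k \le 2\pi)$ and let

(2) $dF(e) \sim \sum c_{n_1, \dots, n_k} e^{i(n_1 x_1 + \cdots + n_k x_k)}$,
$c_{n_1, \dots, n_k} = (1/(2\pi)^k) \int_0^{2\pi} \cdots \int_0^{2\pi} e^{-i(n_1 x_1 + \cdots + n_k x_k)}\,dF(e)$.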
Then 
i) the series on the right of (2), called the Multiple Fourier Stieltjes
Series, is almost everywhere summable $(\nu^2, (k-1)/2 + \epsilon)$ for every $\epsilon > 0$,
i.e.

$\lim_{R \to \infty} \sum_{\nu^2 \le R^2} (1 - (\nu^2/R^2))^{(k-1)/2 + \epsilon}\, c_{n_1, \dots, n_k} e^{i(n_1 x_1 + \cdots + n_k x_k)}$, $\nu^2 = n_1^2 + \cdots + n_k^2$,

exists almost everywhere;
ii) at a point where the symmetric derivative of $F(e)$ exists the series
on the right of (2) is summable $(\nu^2, (k+1)/2 + \epsilon)$ for every $\epsilon > 0$ to the
symmetric derivative of $F(e)$, i.e.


* Received June 12, 1947. 


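$\lim_{R \to \infty} \sum_{\nu^2 \le R^2} (1 - (\nu^2/R^2))^{(k+1)/2 + \epsilon}\, c_{n_1, \dots, n_k} e^{i(n_1 x_1 + \cdots + n_k x_k)} = F'(e)$.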

The main aim of this note is to prove these results. We prove them 
for the two dimensional case only for the sake of simplicity, but the method 
is quite general and can be easily applied to several dimensions. Incidentally 
we give a new proof of Bochner’s result on spherical summation of multiple 
Fourier series [1]. 
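
To make the notion of spherical summability $(\nu^2, \alpha)$ concrete, here is a minimal sketch of the spherical mean of a double trigonometric series. It assumes period 1 in each variable and a finite table of coefficients. The function name, the dictionary layout and the sample series are illustrative assumptions, not notation from this note.

```python
import cmath
import math

def spherical_riesz_mean(coeffs, x, y, R, alpha):
    """Sum of (1 - nu^2/R^2)^alpha * c_{m,n} * exp(2 pi i (m x + n y)) over nu^2 = m^2 + n^2 < R^2.

    coeffs maps lattice points (m, n) to Fourier coefficients (period 1 assumed).
    """
    total = 0j
    for (m, n), c in coeffs.items():
        nu2 = m * m + n * n
        if nu2 < R * R:
            weight = (1.0 - nu2 / (R * R)) ** alpha
            total += weight * c * cmath.exp(2j * math.pi * (m * x + n * y))
    return total

# Coefficients of the trigonometric polynomial 1 + 2 cos(2 pi x).
coeffs = {(0, 0): 1.0, (1, 0): 1.0, (-1, 0): 1.0}
for R in (2.0, 10.0, 100.0):
    print(R, spherical_riesz_mean(coeffs, 0.0, 0.0, R, 1.5).real)
```

For a trigonometric polynomial the means tend to the value of the polynomial as R grows, whatever the order; the theorems of this note concern how large the order must be when the series comes from a general $dF$.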


2. In the Euclidean $x, y$ plane, let $(e)$ denote Borel sets contained in
the fundamental square $(-\tfrac12 \le x, y \le \tfrac12)$ and let $e_{\xi,\eta}$ denote the set of
points $(\xi + x, \eta + y)$, $(x, y) \in e$. Further we denote by $s^r$ the interior of
the circle $x^2 + y^2 = r^2$ and by $s^r_{\xi,\eta}$ the circular region with center $(\xi, \eta)$ and
radius $r$. We further denote by $F(e)$ an additive function of Borel sets $(e)$
defined in $-\tfrac12 \le x, y \le \tfrac12$, which is of bounded variation in this square.
We note that this can be extended throughout the plane by the conditions

$F(e_{k,l}) = F(e)$, $k, l$ being integers.

If at a point $(\xi, \eta)$

$\lim_{r \to 0} F(s^r_{\xi,\eta})/(\pi r^2) = f(\xi, \eta)$

exists, then we say that $F(e)$ has the symmetric derivative $f(\xi, \eta)$ at the
point $(\xi, \eta)$. It will be convenient to write

$F(s^r_{\xi,\eta}) = \phi_{\xi,\eta}(r) = \phi(r)$.

It should be noted that, for every fixed $\xi, \eta$, $\phi(r)$ is of bounded variation in $r$.
Finally we denote by C a positive constant independent of all the variables 
involved; though it may occur in different places, it does not necessarily 
take the same value. 


3. In addition to a few important properties of Bessel functions, we 
require a lemma due to Hardy and Landau and a transformation formula 
for theta functions, which will be made use of for obtaining a formula for 
spherical averages of the double Fourier series. The transformation formula is 


(3) g(u) g(u, 8) e7(m-n)?/® 


> 
and the lemma is 
LEMMA 1. If $K$ is the circle $u^2 + v^2 \le R^2$ and $y(u, v)$ is any continuous
function, then




where summation extends over the lattice points of $K$ and the dash implies
that the lattice points on the boundary are affected by a weight $\tfrac12$.


For the proof cf. G. H. Hardy [2]. 
In this lemma let us take 


y(u, v) — 4? — vy?) i «> 0 
we have then 
(5) D> (RB? —p? — v?) 
lim 9(u)g(o) — dud. 


On the right of (5) we make use of the transformation formula (3) and 


obtain 
> (R? pe v7) (marry) 
= lim f, f (R? — > ((m-z) ut (n-y)v) dydy 
6-0 JK 
= lim (m?+n?) f f (R? v) ((m-x) ut (n-y)v) dyudy 
6-0 K 
(6) 


R 
lim f (R? — (m — 2x)? + (n—y)?}4tdt 
-+- 1) 252 (m2+n?) — 2)? + (n— y)*}3 


T(¢+1) 2x)? + (n— y)?}3 


the series in the last line being absolutely convergent for $\alpha > \tfrac12$. If $x$ and $y$
lie in the interval $(-\tfrac12, \tfrac12)$ we observe that

$(m - \tfrac12)^2 + (n - \tfrac12)^2 \le (m - x)^2 + (n - y)^2 \le (m + \tfrac12)^2 + (n + \tfrac12)^2$,

so that
$1/\{(m - x)^2 + (n - y)^2\} = O(1/(m^2 + n^2))$ if $m^2 + n^2 > 0$,


and hence 

= O{1/R? ¥ 1/(m? + m2) 2a+3/4} 
= 0(R*). 


m?+n?2 > 0 




So we can write (6) in the form 
| (7) > — p? py?) (ua+vy) 
= + O(Re4) 


uniformly in and y, if > 4 where 2? + y? 


4. Now let F(e) be an additive function of bounded variation; then 
from (7) 


8 > 


+ O(R*), 


Again, if 7 is a positive number, to be fixed later, 


ff dF (€,n) 0(R-). 


so that (8) can be rewritten in the form


+ 


Now consider the integral on the right of (9) viz. 


say. If we set 


= 
then 


1/R 


1/R 
| db(r) | = CRB (1/R) 


— 




1/R 1/R 


= — CR*@(1/R) + yan. 


If $\Phi(r) = o(r^2)$ then given $\epsilon > 0$ we can choose $\eta$ so that for $r \le \eta$
$|\Phi(r)| \le \epsilon r^2$
and then
| Js | Ce 
| Jo | <Ce+Ce+ (Ce/R™) f ‘(dr/r@*) < Ce. 
1/R 


Therefore we have
$J_1 + J_2 = o(1)$ as $R \to \infty$.

If $\Phi(r) = r^2(s + o(1))$ then we can show similarly that $J_1 + J_2 = s + o(1)$
as $R \to \infty$. Thus at all points where $\Phi(r) = r^2[s + o(1)]$ and therefore
almost everywhere the series is summable $(\nu^2, \alpha)$, $\alpha > \tfrac12$.


5. We shall now show that at a point where $\phi(r) = o(r^2)$ [or
$\phi(r) = r^2(s + o(1))$] the Fourier Stieltjes series is summable $(\nu^2, \alpha)$ for
$\alpha > 3/2$. We have only to show that


(9) 1/R* 49 (7) =0(1) as for a> 3/2. 
R 
On integrating the integral on the left by parts we obtain 
/r**)4(r)] —OR 2eRr) 
0 0 


= /7**) (7) —CR dr 


= J; + 


say. Let 
| o(r)| Ser? for 


Then 


| J; | < 


TR 
[Ia] dy < Ce if a > 3/2. 
0 




Hence (9) follows. If $\phi(r) = r^2(s + o(1))$, the right side of (9) will be
$s + o(1)$. It should be observed that $\phi(r) = r^2(s + o(1))$ means that
$F(e)$ has the symmetric derivative $s/\pi$ at the point.


6. We shall now use the formula (6) to obtain the fundamental formula
of Bochner: Let $f(\xi, \eta)$ be an integrable function periodic in each of the
variables with period 1 and let us write

$g(x, y) = f(\xi + x, \eta + y)$

so that the behaviour of $f$ in the neighborhood of $(\xi, \eta)$ is the same as
the behaviour of $g(x, y)$ near the origin. Further $g(x, y)$ is also periodic
and if

F(E 0) ~ 


then 
= , 
Further let 


2 2r 
fen(t) = cos + sin = of g(t cos 0, sin 6) dé. 


Now multiply both sides of (6) by g(a,y) and integrate over the region 
§<2,y1. The left side will give us 
>> (R? pe =" v?) > (R? pe 
RP 
while the right side leads to 


+ 1 2x)? + (n— y)?}4) 
nt J, (n—y) y)dedy 


=f 


pita 


— u,n—u)dudv, 


or, if wu? 


pita 


a-1 


where is the rectangle m—1lS2,ySm. 




Hence we have the formula of Bochner [1] 


If, however, we multiply both sides of (7) by $g(x, y)$ and integrate over the
region $-\tfrac12 \le x, y \le \tfrac12$ we have


(11) > — p? — v’) 


T(a+1)R*4* (3 (3 
= J, (2, y) dxdy 
+ O(R*) 
and if we observe that for any 7 > 0 
f g(x, = 0(R*), a>4 
then (11) leads to 
(12) +1)RM* C7 
fen(r) dr + O(R**) 


or what is the same thing 


(13) S(1— (v2 + ) 


a-1 0 


which proves the local property for summability of order $\alpha > \tfrac12$.


INSTITUTE FOR ADVANCED STUDY,
PRINCETON, N.J.


REFERENCES. 


1. S. Bochner, “Summation of multiple Fourier series by spherical means,” Transactions
of the American Mathematical Society, vol. 40 (1936), pp. 175-207.
2. G. H. Hardy, “The lattice points of a circle,” Proceedings of the Royal Society of
London, vol. 107 (1925), pp. 623-635.




ON A LIAPOUNOFF CRITERION OF STABILITY.* 


By GÖRAN BORG.
The following statements will be proved in this paper:
If, in the differential equation

(L) $y'' + \phi(x)\,y = 0$,

the coefficient function $\phi(x)$ is continuous, of period $\pi$, $\phi(x + \pi) = \phi(x)$,
and satisfies

(1) $\int_0^\pi \phi(x)\,dx \ge 0$ and $\phi(x) \not\equiv 0$

and

(2) $\int_0^\pi |\phi(x)|\,dx \le 4/\pi$,

then (L) is stable. These conditions are the “best possible” in the following
sense: There are unstable differential equations (L) for which

(1*) $\int_0^\pi \phi(x)\,dx \ge -\epsilon$ and $\phi(x) \not\equiv 0$

and (2) hold, or for which (1) and

(2*) $\int_0^\pi |\phi(x)|\,dx \le 4/\pi + \epsilon$

hold, where $\epsilon > 0$ is arbitrary.

This result completes a criterion of Liapounoff for stability by replacing
the Liapounoff condition $\phi(x) \ge 0$ and $\phi(x) \not\equiv 0$ by (1).
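
As a purely numerical illustration of the criterion, the sketch below integrates (L) over one period and inspects the trace of the monodromy matrix. The test $|\mathrm{trace}| < 2$ for stability is the standard Floquet criterion and is an assumption brought in here, not something stated in the paper; the Runge-Kutta integrator, the function names and the sample coefficient $\phi(x) = 0.3(1 + \cos 2x)$ are likewise only illustrative.

```python
import math

def monodromy_trace(phi, period=math.pi, steps=4000):
    """Trace of the monodromy matrix of y'' + phi(x) y = 0 over one period (classical RK4)."""
    h = period / steps

    def rhs(x, y, v):
        # First-order system: y' = v, v' = -phi(x) y.
        return v, -phi(x) * y

    def integrate(y, v):
        x = 0.0
        for _ in range(steps):
            k1y, k1v = rhs(x, y, v)
            k2y, k2v = rhs(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
            k3y, k3v = rhs(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
            k4y, k4v = rhs(x + h, y + h * k3y, v + h * k3v)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            x += h
        return y, v

    y1, _ = integrate(1.0, 0.0)  # solution with y(0) = 1, y'(0) = 0
    _, v2 = integrate(0.0, 1.0)  # solution with y(0) = 0, y'(0) = 1
    return y1 + v2

def phi_example(x):
    # Non-negative, period pi, with integral 0.3 * pi over a period, which is below 4 / pi.
    return 0.3 * (1.0 + math.cos(2.0 * x))

print(monodromy_trace(phi_example))        # expected to lie strictly between -2 and 2
print(monodromy_trace(lambda x: -0.5))     # a coefficient violating (1); trace exceeds 2
```

The second call uses a negative constant coefficient, for which (1) fails and the solutions grow exponentially.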


Proof. In order to prove the first assertion above, suppose, if possible,
that (L) is unstable. Then there exists a real solution $y = y(x)$ of (L)
satisfying

(3) $y(x + \pi) = s\,y(x)$ and $y(x) \not\equiv 0$,

where $s \ne 0$ is a real number. Clearly, $y(x)$ has either no zeros or an infinity
of zeros. In the latter case, the distance between adjacent zeros does not

* Received April 21, 1948. 


exceed $\pi$. It will be shown in (A) and (B) that assumptions (1) and (2)
contradict both of these alternatives.


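(A) Suppose that $y(x)$ does not vanish; then by (L)

$\int_0^\pi y'' y^{-1}\,dx + \int_0^\pi \phi\,dx = 0$.

An integration by parts gives

$[y' y^{-1}]_0^\pi + \int_0^\pi y'^2 y^{-2}\,dx + \int_0^\pi \phi\,dx = 0$.

Since $y'(\pi) y^{-1}(\pi) = s s^{-1} y'(0) y^{-1}(0)$, it follows that either

$\int_0^\pi \phi\,dx < 0$ or $\phi \equiv 0$.

These two possibilities contradict the assumptions in (1).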


(B) Suppose that $y(x)$ possesses zeros. It can then be supposed that
$0 < b - a \le \pi$, that $y(a) = y(b) = 0$, and that $y(x) > 0$ on $(a, b)$. It will
be shown that these facts, in addition to the existence and continuity of the
derivatives $y'$ and $y''$, imply¹

(4) $\int_a^b |y'' y^{-1}|\,dx > 4/(b - a)$.

In fact, it is clear that

(5) $\int_a^b |y'' y^{-1}|\,dx > (y_{\max})^{-1} \max_{a \le \xi < \eta \le b} |y'(\xi) - y'(\eta)|$.

But, if $y_{\max} = y(a + l_1) = y(b - l_2)$, where $l_1 + l_2 = b - a$, then it follows
from Rolle’s theorem that
$y'(\xi) = y_{\max}/l_1$ and $-y'(\eta) = y_{\max}/l_2$
for some choice of $\xi$ and $\eta$, where $a < \xi < \eta < b$. Hence,
from (5),

$\int_a^b |y'' y^{-1}|\,dx > 1/l_1 + 1/l_2 = (l_1 + l_2)/(l_1 l_2) \ge 4/(b - a)$,

in view of the arithmetic-geometric inequality. This proves (4).
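
For the reader's convenience, the last step of the proof of (4) can be written out as follows; this is only the elementary arithmetic-geometric estimate applied to the lengths $l_1$, $l_2$ introduced above, with nothing new assumed.

```latex
\frac{1}{l_1} + \frac{1}{l_2}
  = \frac{l_1 + l_2}{l_1 l_2}
  \ge \frac{l_1 + l_2}{\bigl(\tfrac{1}{2}(l_1 + l_2)\bigr)^2}
  = \frac{4}{l_1 + l_2}
  = \frac{4}{b - a},
\qquad \text{since } \sqrt{l_1 l_2} \le \tfrac{1}{2}(l_1 + l_2).
```

Equality in the middle step requires $l_1 = l_2$, that is, the maximum of $y$ attained at the midpoint of $(a, b)$.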


¹ This inequality is due to Beurling. In my paper, “Ueber die Stabilität gewisser
Klassen von linearen Differentialgleichungen,” Arkiv för Matematik, Astronomi och
Fysik, vol. 31, no. 1 (1945), pp. 1-31, the inequality is generalized and used for the
proofs of other, in most respects, more far-reaching stability criteria than the one above.
However, the one above is not contained explicitly in these results.




In view of $b - a \le \pi$, (4) and (L) contradict (2) and, therefore, complete
the proof of the first assertion of the italicized statement.


(C) It will now be shown that (1*) and (2) do not imply the stability
of (L). To this end, let $\tfrac12\pi > \eta > 0$ and $k > 0$. Let $y = y(x) \ge 1$ be a
function which has a period $\pi$, possesses a continuous second derivative, is
identically 1 on the intervals $(0, \tfrac12\pi - \eta)$ and $(\tfrac12\pi + \eta, \pi)$, and has a derivative
satisfying $|y'| < k$ and $y' \not\equiv 0$. Define $\phi(x)$ to be $-y'' y^{-1}$,
so that $\phi(x)$ is continuous and of period $\pi$. Also


f dz = — = — = — 


Hence, if $\epsilon > 0$ is given, (1*) is satisfied if $k$ and $\eta$ are chosen sufficiently
small. Furthermore, (2) is satisfied whenever $k$ is sufficiently small.

It remains to show that (L) is of unstable type. Since the function
$y(x)$ is a periodic solution of (L), the equation (L) can be stable only if all
solutions of (L) are periodic. In this case, the solution $y = z(x)$ of (L)
satisfying the initial condition $z(0) = 0$ (and $z'(0) = 1$) has an infinity
of zeros. This leads to a contradiction, since the Sturm separation theorem
implies that a non-trivial solution of (L) has at most one zero. Consequently,
(L) is unstable.


(D) It will now be shown that (1) and (2*) do not imply the stability 
of (L). To this end, let 44 >> 0 and let y= y(x) be an odd function of 
period 27, possessing a non-positive continuous second derivative on 0 


and, in addition, 
x for OS 
y (2) for dn + 


Define $\phi(x)$ to be $-y'' y^{-1}$ or 0 according as $y \ne 0$ or $y = 0$. Then $\phi(x)$ is
a continuous function of period $\pi$. Also, $\phi(x) \ge 0$, so that (1) is satisfied.


As above, 
| p(x) | dx = — f yy dz; 
0 


but the last expression is not greater than 
so that (2*) is satisfied if $\eta$ is sufficiently small.


It remains to show that (L) is of unstable type. Since the function
$y = y(x)$ is a half-periodic solution of (L), that is, $y(x + \pi) = -y(x)$,




the equation (L) can be stable only if all solutions of (L) are half-periodic.
Suppose (L) is stable. Consider the solution $y = z(x)$ of (L) satisfying
the initial condition $z(a) = 0$ and $z'(a) = m > 0$, where
and 0<8<4r—y. Then z(a+7)—0 and 2’ (a+7)=—~m. Since 
¢(z) =0 for OS and S25 32/2 —y, it follows from 
(L) that z(z) is linear on these intervals, so that 


Also, since $\phi(x) \ge 0$, it is seen from (L) that the solution $z(x)$ is concave
downward where $z(x) > 0$. Hence, on the interval $a < x < a + \pi$, the graph
of $y = z(x)$ does not cross the line $y = m(x - a)$. In particular,


+ y) S m(2y + 8). 


If the arbitrary positive numbers $\eta$, $\delta$ are suitably chosen, the last two formula
lines lead to a contradiction. This shows that (L) is unstable and completes
the proof of the italicized statement.

The statement in (D) was demonstrated in a different way by van Kampen
and Wintner.²


MATEMATISKA INSTITUTIONEN, 
UPPSALA, SWEDEN. 


² E. R. van Kampen and A. Wintner, “On an absolute constant in the theory of
variational stability,” American Journal of Mathematics, vol. 59 (1937), pp. 270-274.




ON THE SPECTRA OF SLIGHTLY DISTURBED LINEAR 
OSCILLATORS.* 


By PHILIP HARTMAN.


The following theorem will be proved: 


THEOREM. Let $q = q(t)$, where $0 \le t < \infty$, be a real-valued, continuous
function satisfying

(1) $q(t) \to 0$ as $t \to \infty$.

Then the half-line $0 \le \lambda < \infty$ is in the spectrum of the eigenvalue problem
determined by

(2) $x'' + (\lambda + q)x = 0$,

by an arbitrary homogeneous boundary condition at $t = 0$, say

$(3_\theta)$ $x(0)\cos\theta + x'(0)\sin\theta = 0$,

and by the $(L^2)$-condition at $t = \infty$,

(4) $\int_0^\infty x^2\,dt < \infty$.


That (2), $(3_\theta)$ and (4) actually determine an eigenvalue problem is
assured by (1); in fact, the boundedness of $q(t)$ from above is sufficient to
this end; cf. [5], p. 252. It is known that the conclusion of the Theorem
is valid if the assumption (1) is replaced by either the assumption


(51) 
[5], p. 264; or the assumptions

$(5_2)$ $\int^\infty |dq| < \infty$ and $q \to 0$ as $t \to \infty$,

[8], p. 270, and Theorem (I), [9], p. 23; or the assumption

$(5_3)$ $\int^\infty q^2\,dt < \infty$,


* Received June 3, 1948. 




[3]. In [5], p. 264, the relation (1) as well as the condition $(5_1)$ is assumed,
but (1) is not needed in this situation. It is known that $(5_1)$ implies
that the half-line $0 \le \lambda < \infty$ is in the continuous spectrum and that no
positive $\lambda$ is in the point spectrum of a boundary value problem (2), $(3_\theta)$
and (4). The same holds if $(5_1)$ is replaced by $(5_2)$, in virtue of the
Theorem (I), [9], and the asymptotic formula [8], p. 270, for the solutions
of (2) under the conditions $(5_2)$. The same assertions hold if $(5_1)$ is
replaced by

$(5_4)$ $\int^\infty t q^2\,dt < \infty$,

[4], pp. 833-834.

It remains undecided whether or not every $\lambda \ge 0$ is in the continuous
spectrum under the assumption (1). On the other hand, it is known that
(1) is compatible with a positive eigenvalue; cf. the examples constructed in
[6], pp. 394-395 or [8], pp. 268-269.

The proof of the Theorem will be based on two different ideas: first,
the constructions just referred to; second, the arguments used in [2] and
[3], the latter being manifestations of the use of the Lebesgue-Toeplitz norm
construction in [9], pp. 26-27, for locating points of the spectrum.

It will be convenient to deduce the theorem from several lemmas.


LEMMA 1. Let $q = q(t)$, where $0 \le t < \infty$, be a continuous function
satisfying (1). A given $\lambda$-value is in the spectrum of the eigenvalue problem
(2), $(3_\theta)$ and (4), for every $\theta$, if there exists a continuous function $\phi = \phi(t)$,
where $0 \le t < \infty$, with the property that the differential equation

(6) $y'' + (\lambda + \phi)y = 0$

possesses a solution $y = y(t)$ satisfying

(7) $y \to 0$ and $y' \to 0$ $(t \to \infty)$

and

(8) $\int_0^\infty y^2\,dt = \infty$,

and, finally,

(9) $\int_0^\infty \phi^2 y^2\,dt < \infty$ and $\int_0^\infty q^2 y^2\,dt < \infty$.


Proof. It can be assumed that there exists a value of $\theta = \theta^*$, where
$0 \le \theta^* < \pi$, determining a boundary condition $(3_{\theta^*})$ corresponding to which
the given $\lambda$-value is an eigenvalue. For otherwise it follows from the oscillation
theorem of [2] that $\lambda$ is in the spectrum belonging to every boundary
condition $(3_\theta)$, where $0 \le \theta < \pi$. Let $x = x^*(t)$ denote an eigenfunction
belonging to $\lambda$ and $\theta^*$, so that $x = x^*(t) \not\equiv 0$ satisfies (2), $(3_{\theta^*})$ and (4).
Then (1) and (4) imply

(10) $x^* \to 0$ and $x^{*\prime} \to 0$ $(t \to \infty)$;

cf. [7], p. 13.

Suppose, if possible, that for some value $\theta$, where $0 \le \theta < \pi$, the given
$\lambda$ is not in the corresponding spectrum. Then it is known ([5], p. 251) that
the inhomogeneous equation

(11) $x'' + (\lambda + q)x = g$

has a unique solution $x = X(t)$ satisfying $(3_\theta)$ and (4) whenever $g = g(t)$
is a continuous function of class $(L^2)$ on $0 \le t < \infty$,

(12) $\int_0^\infty g^2\,dt < \infty$.

Choose the function $g$ to be $(q - \phi)y$, where $\phi$ and $y$ are the functions
satisfying (6)-(9). Then (12) follows from (9), while (11) becomes

(13) $x'' + (\lambda + q)x = (q - \phi)y$.

Let $x = X = X(t)$ denote the solution of (13) satisfying $(3_\theta)$ and (4).
Then

(14) $X \to 0$ and $X' \to 0$ $(t \to \infty)$;

cf. [7], p. 13 (although the proof given loc. cit. is for a homogeneous
equation, it is also applicable to the inhomogeneous equation (11)) or cf. [1].

It is seen from (6) that $x = y(t)$ also is a solution of the inhomogeneous
equation (13). Hence, the difference $x = X - y$ is a solution of the homogeneous
equation (2). Furthermore, since $X$ is, but $y$ is not, of class $(L^2)$,
cf. (8), the difference $X - y$ is not. Accordingly, the solutions $x = x^*$ and
$x = X - y$ are linearly independent. Consequently, (7), (10) and (14) lead
to a contradiction, since the Wronskian of $x^*$ and $X - y$ is a non-vanishing
constant. Hence the proof of Lemma 1 is complete.


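LEMMA 2. Let $q = q(t)$, where $0 \le t < \infty$, be a continuous function
satisfying (1). Then there exists a function $z = z(t)$, where $0 \le t < \infty$,
possessing a continuous second derivative and having the properties that

(15) $z > 0$ and $z' < 0$,
(16) $z \to 0$ and $z' \to 0$ $(t \to \infty)$,
(17) $z'/z \to 0$ $(t \to \infty)$,
(18) $\int_0^\infty z^2\,dt = \infty$ but $\int_0^\infty q^2 z^2\,dt < \infty$,
(19) $\int_0^\infty (z'/z)^2 z^2\,dt < \infty$ and $\int_0^\infty ((z'/z)')^2 z^2\,dt < \infty$,

and

(20) $\int_0^\infty (z'/z) \cos 2t\,dt$ is convergent.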


Proof. To simplify the construction, the function $z$ defined below will be
a “smooth” function, except for a sequence of “corners” tending to $\infty$.
At a corner, the symbols $z'$, $z''$ will represent either of the limits $z'(t \pm 0)$,
$z''(t \pm 0)$, respectively. In view of the convexity of the function $z^2$, cf. (21)
below, these corners can obviously be removed without influencing the above
relations. (For the application to be made in Lemma 3, it is unnecessary to
carry out this smoothing process; cf. the remarks following (29) below.)


Let $\alpha = \alpha_k$, where $k = 1, 2, \dots$, denote a sequence of numbers such
that
(i) >1, 
(ii) lq(t)| << W/kift>oa, 
(iii) tess > 


(so that the sequence $\alpha_k$ increases faster than a geometric progression),

(iv) $\alpha_k$ is an odd integral multiple of $\tfrac12\pi$.

Let $\beta = \beta_k$, where $k = 1, 2, \dots$, denote a sequence of numbers such that
(v) < Be < Mears 

(vi) Bi (k—> 

(vii) $\beta_k$ is an odd integral multiple of $\tfrac12\pi$.


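Since this lemma concerns large $t$-values, it will suffice to define the
function $z(t)$ on $\beta_1 \le t < \infty$. In terms of the numbers $\alpha_k$, $\beta_k$, let $z(t)$
be the positive continuous function determined by

(21) $z^2(t) = t^{-2}$ if $\beta_k \le t \le \alpha_{k+1}$ and $k = 1, 2, \dots$;
$z^2(t)$ is linear if $\alpha_k \le t \le \beta_k$ and $k = 2, 3, \dots$.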




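For the sake of simplicity, $\alpha$, $\beta$, $A$, $B$ will be written in place of $\alpha_k$, $\beta_k$,
$A_k$, $B_k$, respectively. Then

(22) $z^2(\alpha) = \alpha^{-2}$ and $z^2(\beta) = \beta^{-2}$

and

(23) $z^2(t) = At + B$ $(\alpha \le t \le \beta)$,

where

$A = A_k = (\beta^{-2} - \alpha^{-2})/(\beta - \alpha) = -(\beta + \alpha)/(\alpha^2 \beta^2)$.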
Hence, by (vi), 
(24) —A~ke, (k— 
The function $z(t)$ is positive, decreasing and satisfies the first relation
in (16). Also,

(25) $2(z'/z)$ is either $A/(At + B)$ or $-2/t$,

according as $t$ $(\ge \beta_1)$ is or is not in an interval of the type $\alpha \le t \le \beta$. In
the first case, it is seen from (22), (23) and (24) that


A/(At + B) = O(ka-*B?) = O(k*). 
Consequently, (17) holds. The second limit relation in (16) is implied by 
(17) and the first part of (16). 


There remains to verify the statements involving the integrals in (18), 
(19) and (20). It is seen from (22) and the second part of (21) that 


B 
(26) Jf = + (B—a) ~ (k > ©). 


Hence, the first relation in (18) follows from (vi), since 


B 
f = const. = o. 
k “a k 


On the other hand, 


ee) co 
f eas f f 
Bi Bx k a 


The first integral on the right converges in virtue of (1). From (ii), (vi) 
and (26), 


B B 
f f = O(k*). 


— 
_ 


76 PHILIP HARTMAN. 


Thus, the second part of (18) is verified. 
Similarly, it is seen from the first part of (21) that 


* 00 oo B 
f < f f 
By Bx k a 


and from (23) and (25) that 


B B B 
f = f A?(At+ B)“*dt = A log (At+ B)| 
a a a 
From (22) and (23), the last expression equals 
A log a*/B? ~ const. ka-* log a, 
in view of (24) and (vi). Since 
> ka-* log a < 0, 
k 
the first relation in (19) follows. From (25), 
$(z'/z)' = O((z'/z)^2)$,
as $t \to \infty$. Hence, (17) shows that


so that the second part of (19) is implied by its first part. 
In view of (25), 


00 B 
cos 2¢ dt = — f cos 2¢dt+ > ((2’/z) + t-*) cos 2¢ dt, 
Ba By k a 
provided that the series converges. Since 


B 
f t-* cos 2¢ dt = O(a"), 
it follows that 
B 
>| f t-1 cos dt | = O( Sa") < 
k k 


An integration by parts gives 


2 cos dt = (2’/z) sin 2t (2’/z)’ sin 2t dt. 
The first expression on the right vanishes, since $\alpha$ and $\beta$ are integral multiples
of $\tfrac12\pi$. From (25),




B B 
—2f (2'/2)' sin 2t dt — “A*(At sin 2¢ dt 
a a 


if the new variable $s = -(t + B/A)$ is introduced, the last integral becomes
-a-B/A 
— f s* sin (2s + 2B/A)ds. 
-B-B/A 
From (22), (23) and (24), 
and 


by (vi). In view of the oscillatory character of sin (2s-+ 2B/A) and the 
monotony of 
-a-B/A 
f sin (2s + 2B/A)ds = O(6 + = O(k?). 
-B-B/A 


Consequently, 
S| (2/2) cos 2t dt| < @. 
k wa k 


This completes the proof of (20) and of Lemma 2. 


LEMMA 3. If $q = q(t)$, where $0 \le t < \infty$, is a continuous function
satisfying (1) and if $\lambda > 0$, then there exists a continuous function $\phi(t)$
satisfying the conditions of Lemma 1.

Proof. It can be supposed that $\lambda = 1$; otherwise the unit of length on
the $t$-axis is changed in the proportion $1 : \lambda^{1/2}$. Let $z = z(t)$ denote a function
possessing a continuous second derivative and satisfying (15)-(20). In terms
of $z(t)$, define functions $\chi = \chi(t)$ and $\phi = \phi(t)$ by

(27) $\chi(t) = 2(z'/z)\cos t$

and

(28) $-\phi(t) = \chi^2 \cos^2 t + \chi' \cos t - 3\chi \sin t$,

respectively. Then ([8], p. 268)

(29) $y = y(t) = \exp\left(\int_0^t \chi(s)\cos s\,ds\right)\cos t$




is a solution of the differential equation (6), where $\lambda = 1$. It remains to
show that (7), (8) and (9) hold.
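
The claim that (29) solves (6) with $\lambda = 1$ is a purely algebraic identity between (27), (28) and (29), and it can be spot-checked numerically. The sketch below does so for the hypothetical choice $z(t) = e^{-At}$, picked only because the integral in (29) then has a closed form. This $z$ does not satisfy (17), so it tests the identity and nothing else; the constant A, the function names and the finite-difference step are assumptions of this example.

```python
import math

A = 0.2  # hypothetical constant: z(t) = exp(-A t), so that z'/z = -A

def chi(t):
    # (27): chi = 2 (z'/z) cos t
    return -2.0 * A * math.cos(t)

def chi_prime(t):
    return 2.0 * A * math.sin(t)

def phi(t):
    # (28): -phi = chi^2 cos^2 t + chi' cos t - 3 chi sin t
    c, s = math.cos(t), math.sin(t)
    return -(chi(t) ** 2 * c * c + chi_prime(t) * c - 3.0 * chi(t) * s)

def y(t):
    # (29); here the integral of chi(s) cos s from 0 to t equals -A (t + sin(2t)/2)
    return math.exp(-A * (t + 0.5 * math.sin(2.0 * t))) * math.cos(t)

def residual(t, h=1e-4):
    # Central second difference for y'' plus (1 + phi) y; should be close to zero.
    second = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return second + (1.0 + phi(t)) * y(t)

worst = max(abs(residual(0.1 * k)) for k in range(1, 100))
print(worst)
```

The printed residual is limited by the finite-difference error, not by the identity itself.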

It can be remarked that even if $z$ is the function constructed in the
proof of Lemma 2, so that $z'$, $z''$ are discontinuous at $t = \alpha_k, \beta_k$, the functions
(27) and (28) are still continuous, while (29) possesses a continuous second
derivative. This is a consequence of the fact that $\alpha_k$, $\beta_k$ are odd integral
multiples of $\tfrac12\pi$.
In virtue of (27), the integral in the $\exp(\ )$ factor in (29) is

$2\int_0^t (z'/z)\cos^2 s\,ds = \int_0^t (z'/z)\,ds + \int_0^t (z'/z)\cos 2s\,ds$.

But (20) shows that the last integral tends to a finite limit as $t \to \infty$.
Hence, the local absolute continuity of $\log z$ implies

(30) $\exp\left(\int_0^t \chi(s)\cos s\,ds\right) \sim C z(t)$, $(t \to \infty)$,

where

$C = \exp\left(\int_0^\infty (z'/z)\cos 2t\,dt\right) > 0$.
Consequently, the first part of (7) follows from (29), (30) and the 

first part of (16). A differentiation of (29) gives 


$y' = \chi y \cos t - \exp(\ )\sin t$,


so that the second part of (7) is implied by the first part of (7), (27) and 
(17), (30) and the first part of (16). 
The monotony of $z$ and the first part of (18) show that

$\int^\infty z^2 \cos^2 t\,dt = \infty$,

so that (8) follows from (29) and (30). Also, the second part of (9)
follows from (30) and the second part of (18). Finally, in view of (28),
the first part of (9) will be verified if it is shown that

(31) $\int^\infty \chi^4 z^2\,dt < \infty$, $\int^\infty \chi'^2 z^2\,dt < \infty$ and $\int^\infty \chi^2 z^2\,dt < \infty$.

Since (17) implies that $\chi \to 0$ as $t \to \infty$, the first of the relations (31) is a
consequence of the last. On the other hand, the last part of (31) follows
from (27) and the first part of (19). Finally, from (27),

$\chi' = 2(z'/z)' \cos t - 2(z'/z)\sin t$.




Hence, the second relation in (31) is implied by (19). This completes the 
proof of Lemma 3. 

The Theorem now follows from Lemmas 1 and 3 and the fact that the
spectrum is a closed set.


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] P. Hartman, “The L²-solutions of linear differential equations of second order,”
Duke Mathematical Journal, vol. 14 (1947), pp. 323-326.
[2] ——— and A. Wintner, “An oscillation theorem for continuous spectra,” Proceedings
of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.
[3] C. R. Putnam, “On the spectra of certain boundary value problems,” American
Journal of Mathematics, vol. 71 (1949), pp. 109-111.
[4] S. Wallach, “On the location of spectra of differential equations,” American Journal
of Mathematics, vol. 70 (1948), pp. 833-841.
[5] H. Weyl, “Ueber gewöhnliche Differentialgleichungen mit Singularitäten und
die zugehörigen Entwicklungen willkürlicher Funktionen,” Mathematische
Annalen, vol. 68 (1909), pp. 220-269.
[6] A. Wintner, “The adiabatic linear oscillator,” American Journal of Mathematics,
vol. 68 (1946), pp. 385-397.
[7] ———, “(L²)-connections between the potential and kinetic energies of linear
systems,” ibid., vol. 69 (1947), pp. 5-13.
[8] ———, “Asymptotic integrations of the adiabatic oscillator,” ibid., vol. 69 (1947),
pp. 251-272.
[9] ———, “On the location of continuous spectra,” ibid., vol. 70 (1948), pp. 22-30.


BEST APPROXIMATE INTEGRATION FORMULAS; 
BEST APPROXIMATION FORMULAS.* 


By ARTHUR SARD.¹


Introduction. A criterion for determining best formulas of approximate 
integration is proposed in Section 1. A criterion for determining best 
approximation formulas of a class of general type is proposed in Section 5. 

Particular best and nearly best integration formulas are given in 
Section 2. 


1. Best approximate integration formulas. Consider approximations
of an integral

$I = \int_a^b x(t)\,dt$

of the form

$A = k_1 x(t_1) + \cdots + k_p x(t_p)$,

where $k_1, \dots, k_p$; $t_1, \dots, t_p$ are constants independent of the function $x(t)$
and $a \le t_i \le b$, $i = 1, \dots, p$. Suppose that $A$ is such that the approximation
is exact: $A = I$, whenever $x(t)$ is a polynomial in $t$ of degree $n$. The non-negative
integer $n$ is fixed throughout our discussion.

Let @ be a class of such approximations $A$. We shall suggest a criterion
for determining which (if any) approximation in @ is best. Our criterion
will be relative to the integer $n$.

Put

$R[x] = I - A$;

$R[x]$ is a linear (that is, additive and continuous) functional on the space $C_0$
of functions continuous on $a \le t \le b$ with norm

$\|x\| = \max_{a \le t \le b} |x(t)|$.


Furthermore $R[x]$ vanishes whenever $x(t)$ is a polynomial of degree $n$. It
follows that there exists a function $k(t)$ such that


* Received January 16, 1948. 
¹ The author gratefully acknowledges financial support received from the Office of
Naval Research, Navy Department.




(1.1) $R[x] = \int_a^b x^{(n+1)}(t)\,k(t)\,dt$,

whenever $x(t)$ is a function with absolutely continuous $n$-th derivative. The
function $k(t)$ may be taken as the following:


(1. 2) k(t’) = =— gr], 
0 if 
(1.3) { ift >t, 
! if isf 
(1. 4) pt = 0 if 


As a matter of fact $k(t')$ in (1.2) is an $n$-fold integral of a function of
bounded variation.²
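
A small numerical sketch may help fix ideas about the kernel in (1.1). It uses the standard Peano-kernel form $k(s) = R_t[(t - s)_+^n]/n!$, which is an assumption on my part about the normalisation, since the displayed formulas (1.2)-(1.4) are not fully legible here. The trapezoidal rule on $[0, 1]$ with $n = 1$, the function names and the midpoint quadrature are likewise illustrative choices.

```python
import math

def peano_kernel(s, nodes, weights, a, b, n):
    """k(s) = R_t[(t - s)_+^n] / n! for R[x] = integral of x over [a, b] minus sum k_i x(t_i); n >= 1."""
    exact = (b - s) ** (n + 1) / (n + 1)  # integral of (t - s)_+^n over a <= t <= b, for a <= s <= b
    approx = sum(w * max(t - s, 0.0) ** n for t, w in zip(nodes, weights))
    return (exact - approx) / math.factorial(n)

def integrate(f, a, b, panels=2000):
    """Composite midpoint rule, adequate for the smooth integrands used here."""
    h = (b - a) / panels
    return h * sum(f(a + (i + 0.5) * h) for i in range(panels))

# Trapezoidal rule on [0, 1]: A = (x(0) + x(1)) / 2, exact for polynomials of degree n = 1.
nodes, weights, a, b, n = [0.0, 1.0], [0.5, 0.5], 0.0, 1.0, 1

def k(s):
    return peano_kernel(s, nodes, weights, a, b, n)

J = integrate(lambda s: k(s) ** 2, a, b)               # integral of k^2
remainder = integrate(lambda s: 6.0 * s * k(s), a, b)  # kernel form of R[x] for x(t) = t^3
direct = 0.25 - 0.5 * (0.0 + 1.0)                      # I - A computed directly for x(t) = t^3
print(J, remainder, direct)
```

The kernel form of the remainder and the directly computed remainder agree up to quadrature error, which is the content of a representation such as (1.1) in this special case.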

For each approximation in @ there is a function $k$ and a representation
(1.1) of the remainder $R[x]$ corresponding to the approximation. We shall
say that an approximation in @ is best if the integral

$J = \int_a^b k^2(t)\,dt$

corresponding to it is a minimum in the set of all such integrals corresponding
to the elements of @.

The use of the word “best” may be justified as follows. The best
element of @ should be one for which $R[x]$ is smallest, in some sense. If $J$
is a minimum, that part of (1.1) which is independent of $x(t)$ is small.
Precisely put, Schwarz’s inequality implies that

(1.5) $|R[x]| \le J^{1/2} \left(\int_a^b x^{(n+1)}(t)^2\,dt\right)^{1/2}$,

providing that $x^{(n+1)}(t)^2$ is integrable. Thus if (1.5) is used to appraise
$R[x]$, the best formula of @, as defined above, gives the least appraisal.
Heretofore the appraisal of (1.1) by the theorem of the mean: 


(1.6) k(t) | dt sup | a(n+t) (¢) | 


has often been used. In many situations the appraisal (1.5) seems pre- 
ferable to (1.6); such situations are ones in which it is easier to estimate 


² The results of this paragraph are due to F. Riesz, Peano, and Rémès. Cf. A. Sard,
"Integral representations of remainders," Duke Mathematical Journal, vol. 15 (1948),
pp. 333-345.


82 ARTHUR SARD. 


$$\int_a^b x^{(n+1)}(t)^2\,dt \quad \text{than} \quad \sup |x^{(n+1)}(t)|.$$


If the appraisal (1.6) is used, the least appraisal will be obtained from
that approximation in $\mathcal{C}$ (if any) for which

$$\int_a^b |k(t)|\,dt$$

is a minimum. Actually to determine such an approximation may be very
complicated in practice; even in simple cases irrational values of the
coefficients may be involved.

In the past a procedure sometimes followed has been to choose that
approximation in $\mathcal{C}$ (if any) which is exact for a degree greater than $n$.
Such an approximation permits an appraisal in terms of a higher derivative
of $x(t)$ than the $(n+1)$-th providing that $x(t)$ possesses certain proper
higher derivatives. However the function $x(t)$, or our knowledge of $x(t)$,
may be such that an appraisal in terms of the higher derivative may not be
possible or useful. For example the higher derivative may not exist; or our
knowledge of $x(t)$ may be confined to a set of particular values of $x(t)$,
contaminated by error, from which we can estimate or guess at properties
of the $(n+1)$-th derivative but not a higher derivative. Alternatively the
higher derivative may be so large that an appraisal in terms of it will be less
powerful than an appraisal in terms of the lower derivative.

As a precise example, suppose that

(1.7) $$I = \int_0^6 x(t)\,dt, \qquad A = c_0 x(0) + c_1 x(1) + \cdots + c_6 x(6),$$

where $c_0, \cdots, c_6$ are constants; and suppose that $\mathcal{C}$ consists of all formulas
$A$ of the form (1.7) which are exact whenever $x(t)$ is a polynomial of
degree 1. The Newton-Cotes formula is that formula of $\mathcal{C}$ which is exact
whenever $x(t)$ is a polynomial of degree 6. The Newton-Cotes formula
admits a representation (1.1) with $n = 6$, providing that $x^{(6)}(t)$ is absolutely
continuous. But if a representation (1.1) with $n = 1$ is to be used, the
Newton-Cotes formula is neither convenient nor apt. Calculators, aware of
this fact, have followed alternative procedures; for example they have used
that formula of $\mathcal{C}$ which results from an application of the trapezoidal rule
to the integrals

$$\int_0^1 x(t)\,dt, \quad \int_1^2 x(t)\,dt, \quad \cdots, \quad \int_5^6 x(t)\,dt.$$




The particular formula thus obtained leads to an appraisal (1.5) which is
more than twice the appraisal given by the best formula, in our sense.³


2. Particular best integration formulas. In this section $R[x]$ will
stand for the remainder in the approximation of

$$\int_0^m x(t)\,dt$$

by a linear combination of the $m + 1$ values of $x(t)$ at $t = 0, 1, \cdots, m$;
the linear combination being such that the approximation is exact whenever
$x(t)$ is a polynomial of degree $n$. Put $x_i = x(i)$. Then

(2.1) $$R[x] = \int_0^m x(t)\,dt - (c_0 x_0 + c_1 x_1 + \cdots + c_m x_m) = \int_0^m x^{(n+1)}(t)\,k(t)\,dt$$

whenever $x^{(n)}(t)$ is absolutely continuous. Here $m$ and $n$ are integers,
$m \ge 1$, $n \ge 0$; $c_0, \cdots, c_m$ are constants; $k$ is defined by (1.2).

For fixed $m$ and $n$, let $\mathcal{C}_{mn}$ be the class of all formulas (2.1). We shall
show in the next section that either there is a unique best formula in $\mathcal{C}_{mn}$ or
$\mathcal{C}_{mn}$ is empty. The latter case occurs if and only if $m$ is less than the largest
even number contained in $n$.

Table 1 gives the coefficients of the best formulas in all
cases $n \le 3$, $m \le 6$ for which formulas exist. In each case Table 1 gives
also the quantity

(2.2) $$J_{mn} = \int_0^m k^2(t)\,dt$$

which enters in the appraisal (1.5). In each of the best formulas

(2.3) $$c_{m-i} = c_i, \quad i = 0, 1, \cdots, m;$$


³ For the best formula, $c_0 = c_6 = 41/104$, $c_1 = c_5 = 118/104$, $c_2 = c_4 = 100/104$,
$c_3 = 106/104$, and $J = 77/6240 = .0123$.

For the iterated trapezoidal rule, $c_0 = c_6 = 1/2$, $c_1 = \cdots = c_5 = 1$, and
$J = 6/120 = .05$.

For the Newton-Cotes formula, $c_0 = c_6 = 41/140$, $c_1 = c_5 = 216/140$,
$c_2 = c_4 = 27/140$, $c_3 = 272/140$, and $J = 27/350 = .077$.

The coefficients of the best formula and the corresponding value of $J$ are given in
Table 1, $m = 6$, $n = 1$. The comparative effectiveness of the best formula and the
iterated trapezoidal rule is indicated by the square root of the ratio of the values of $J$,
viz. $[6/120 \div 77/6240]^{1/2} = 2.01$. The numerical values of $J$ can be obtained from
equation (3.6) with $p = 3$, $n = 1$.


accordingly certain coefficients $c_i$ are not tabulated. The use of Table 1 is
illustrated by the best formula cited in footnote 3.

A computer using (2.1) may determine an appropriate value of $n$
as follows. He may estimate (by means of differences or in some other way)
the products

$$J_{mn} \int_0^m x^{(n+1)}(t)^2\,dt$$

for different $n$, and then choose the value of $n$ corresponding to the least
estimate.
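The quantity $J$ of (2.2) can be checked for any particular formula by integrating the square of the kernel piece by piece. The following is a minimal sketch, not the author's procedure: it assumes the kernel is taken from (1.2) and (1.3) on each interval between consecutive ordinates, uses exact rational arithmetic, and does not itself verify that the formula is exact for degree $n$. The names `shifted_power`, `kernel_piece`, `integral_of_square` and `quantity_J` are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def shifted_power(c, n):
    """Coefficients, lowest degree first, of (c - s)**n as a polynomial in s."""
    return [Fraction(comb(n, r) * c ** (n - r) * (-1) ** r) for r in range(n + 1)]


def kernel_piece(coeffs, n, j):
    """Polynomial in s representing k(s) on j < s < j + 1, from (1.2), (1.3)."""
    m = len(coeffs) - 1
    # Integral of (t - s)^n / n! over s <= t <= m.
    poly = [x / factorial(n + 1) for x in shifted_power(m, n + 1)]
    # Subtract c_i (i - s)^n / n! for the ordinates lying to the right of s.
    for i in range(j + 1, m + 1):
        for r, x in enumerate(shifted_power(i, n)):
            poly[r] -= coeffs[i] * x / factorial(n)
    return poly


def integral_of_square(poly, lo, hi):
    """Exact integral of poly(s)**2 over lo <= s <= hi (integer limits)."""
    total = Fraction(0)
    for r, x in enumerate(poly):
        for s, y in enumerate(poly):
            d = r + s + 1
            total += x * y * Fraction(hi ** d - lo ** d, d)
    return total


def quantity_J(coeffs, n):
    """J of (2.2) for the formula with coefficients c_0, ..., c_m."""
    coeffs = [Fraction(c) for c in coeffs]
    m = len(coeffs) - 1
    return sum(integral_of_square(kernel_piece(coeffs, n, j), j, j + 1)
               for j in range(m))


trapezoid = [Fraction(1, 2), 1, 1, 1, 1, 1, Fraction(1, 2)]
best = [Fraction(c, 104) for c in (41, 118, 100, 106, 100, 118, 41)]
print(quantity_J(trapezoid, 1), quantity_J(best, 1))  # prints 1/20 77/6240
```

The two printed values agree with the figures quoted in footnote 3 for the iterated trapezoidal rule and the best formula with $m = 6$, $n = 1$.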


TABLE 1.

BEST APPROXIMATE INTEGRATION FORMULAS.

m     c_0 A     c_1 A     c_2 A     c_3 A         A   J_mn
n = 0
1         1                                       2   1/12 = .083333
2         1         2                             2   2/12 = .166667
3         1         2                             2   3/12 = .25
4         1         2         2                   2   4/12 = .333333
5         1         2         2                   2   5/12 = .416667
6         1         2         2         2         2   6/12 = .5
n = 1
1         1                                       2   1/120 = .008333
2         3        10                             8   1/160 = .00625
3         4        11                            10   1/120 = .008333
4        11        32        26                  28   1/105 = .009524
5        15        43        37                  38   5/456 = .010965-
6        41       118       100       106       104   77/6240 = .012340
n = 2
2         1         4                             3   1/1890 = .000529
3         3         9                             8   11/8960 = .001228
4        21        76        46                  60   11/12600 = .000873
5       112       379       289                 312   73/69888 = .001045-
6        55       192       132       172       155   11/10850 = .001014
n = 3
2         1         4                             3   1/9072 = .000110
3         3         9                             8   13/17920 = .000725
4      2349      9932      4430                7248   6557/36529920 = .000179
5     29392    110209     76819               86568   61633/193912320 = .000318
6   1082811   4409946   2225043   4304484   3290014   210047/921203920 = .000228




TABLE 2.

NEARLY BEST FORMULAS.

                                                                          Percentage
                                                                          by which J
m   c_0 A   c_1 A   c_2 A   c_3 A       A   J                             exceeds J_mn
n = 1
4      39     115      92             100   143/15000 = .009533             .10
5      40     112      98             100   11/1000 = .011                  .32
6      40     112      98     100     100   31/2500 = .0124                 .49
n = 2
5      43     146     111             120   79/75600 = .001045              .04
6     355    1238     853    1108    1000   7097/7000000 = .001014          .003
n = 3
4      97     412     182             300   10193/56700000 = .000180        .15
5     407    1529    1064            1200   1154043/3628800000 = .000318    .06
6     329    1341     675    1310    1000   31923/140000000 = .000228       .003
n = 3 (alternative formulas)
5      81     307     212             240   46987/145152000 = .000324      1.85-
6      33     134      67     132     100   1487/6300000 = .000236         3.52


In certain cases the coefficients $c_i$ of the best formulas are somewhat
cumbersome. For this reason the "nearly best" formulas of Table 2 may
be of interest. The formulas of Table 2 are of the form (2.1); in each case
the formula is exact for degree $n$ and

$$J = \int_0^m k^2(t)\,dt$$

is close to the absolute minimum $J_{mn}$. Table 2 gives $J$ and also the percentage
by which $J$ exceeds $J_{mn}$. Since the appraisal (1.5) involves the square root
of $J$, the extent to which the appraisal is increased by the use of a nearly
best formula instead of the best formula is indicated by one half the percentage
increase of $J$ over $J_{mn}$. In two cases, two alternative nearly best formulas
are given.

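The cited formulas transform to formulas of approximation of other
integrals readily. Thus suppose that

$$t^* = \alpha t + \beta, \quad \alpha > 0,$$
$$x^*(t^*) = x(t),$$
$$R^*[x^*] = \int_{a^*}^{b^*} x^*(t^*)\,dt^* - [c^*_0 x^*(t^*_0) + \cdots + c^*_m x^*(t^*_m)],$$

where $t^*_i = \alpha i + \beta$, $b^* = t^*_m$, $a^* = t^*_0$. Then $R^*[x^*] = \alpha R[x]$ if

(2.4) $$c^*_i = \alpha c_i, \quad i = 0, \cdots, m,$$

in which case $k^*(t^*) = \alpha^{n+1} k(t)$ and

(2.5) $$J^* = \int_{a^*}^{b^*} k^*(t^*)^2\,dt^* = \alpha^{2n+3} J.$$

Thus best and nearly best formulas for

$$\int_{a^*}^{b^*} x^*(t^*)\,dt^*$$

can be obtained from Tables 1 and 2 respectively merely by using (2.4).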
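The change of variable above can be sketched in a few lines. This is only an illustration under stated assumptions: the ordinates are taken equally spaced on $[a, b]$ with $\beta = a$ and $\alpha = (b - a)/m$, the limits `a`, `b` are integers or `Fraction` values, and the function names `approximate_integral` and `scaled_J` are hypothetical.

```python
from fractions import Fraction


def approximate_integral(coeffs, x, a, b):
    """Apply a tabulated formula (2.1) to the integral of x over [a, b].

    The tabulated coefficients c_i are multiplied by alpha, as in (2.4).
    """
    m = len(coeffs) - 1
    alpha = Fraction(b - a, m)  # scale factor in t* = alpha t + beta, beta = a
    return sum(alpha * Fraction(c) * x(a + alpha * i)
               for i, c in enumerate(coeffs))


def scaled_J(J, n, a, b, m):
    """Value of J* given by (2.5) for the interval [a, b] with m + 1 ordinates."""
    return Fraction(b - a, m) ** (2 * n + 3) * J


# Best formula for m = 2, n = 1 from Table 1: coefficients 3/8, 10/8, 3/8.
best_2_1 = [Fraction(3, 8), Fraction(10, 8), Fraction(3, 8)]
print(approximate_integral(best_2_1, lambda t: 2 * t + 1, 1, 3))  # prints 10
```

The test function $2t + 1$ is a polynomial of degree 1, so the formula reproduces its integral over $[1, 3]$ exactly.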
The following inequality holds whenever $J_{m_1 n}$ and $J_{m_2 n}$ exist (that is,
whenever the classes $\mathcal{C}_{m_1 n}$ and $\mathcal{C}_{m_2 n}$ are non-empty):

(2.6) $$J_{m_1 n} + J_{m_2 n} \ge J_{m_1 + m_2,\, n}.$$

For, a formula (2.1) with $m = m_1 + m_2$ can be constructed by combining
(2.1) with $m = m_1$ and (2.1) with $m = m_2$ translated so as to refer to the
integral from $m_1$ to $m_1 + m_2$. By (2.5) the translation leaves $J_{m_2 n}$ unchanged.
Furthermore, $J$ for the combined formula is the left side of (2.6). The
inequality (2.6) follows, since $J_{m_1 + m_2,\, n}$ is the minimum of all $J$ with
$m = m_1 + m_2$.

The equality in (2.6) holds if and only if the coefficients $c_i$ obtained
when the formulas are combined are precisely the coefficients of the best
formula for $m = m_1 + m_2$, since the minimum of $J$ is taken on at a unique
point.


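3. Derivation of the formulas of section 2. We shall consider the case
of an odd number of ordinates:

$$m = 2p.$$

The case of an even number of ordinates is treated similarly.
After a suitable translation of the $t$-axis, the remainder is of the form:

(3.1) $$R[x] = \int_{-p}^{p} x(t)\,dt - c\,x(0) - \sum_{i=0}^{p-1} [a_i x(p - i) + b_i x(i - p)],$$

where $R[x] = 0$ whenever $x(t)$ is a polynomial of degree $n$. The latter
condition is satisfied if and only if $R[x] = 0$ for $x = 1$, $x = t$, $\cdots$,
$x = t^n$; that is, if and only if

(3.2) $$\sum_{i=0}^{p-1} (a_i + b_i)(p - i)^q = 2p^{q+1}/(q + 1), \quad q \text{ even}, \ 0 < q \le n,$$
(3.3) $$\sum_{i=0}^{p-1} (a_i - b_i)(p - i)^q = 0, \quad q \text{ odd}, \ q \le n,$$
(3.4) $$\sum_{i=0}^{p-1} (a_i + b_i) = 2p - c.$$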


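Our problem is to determine $(a_i, b_i, c)$ such that the $n + 1$ constraints
(3.2), (3.3), (3.4) are satisfied and such that the integral

$$J = \int_{-p}^{p} k^2(t)\,dt$$

is a minimum, where $k$ is given by (1.2). In the process of solution, we shall
prove that $a_i = b_i$, as one expects.

In order to calculate $k(t')$ we use (1.2) and (1.3) for $t' \ge 0$; (1.2)
and (1.4) for $t' \le 0$. By (1.2) and (1.3),

$$k(t') = (p - t')^{n+1}/(n+1)! - a_0 (p - t')^n/n!, \quad p - 1 \le t' \le p,$$
$$k(t') = (p - t')^{n+1}/(n+1)! - a_0 (p - t')^n/n! - a_1 (p - 1 - t')^n/n!, \quad p - 2 \le t' \le p - 1,$$
$$\cdots$$
$$k(t') = (p - t')^{n+1}/(n+1)! - a_0 (p - t')^n/n! - a_1 (p - 1 - t')^n/n! - \cdots - a_{p-1} (1 - t')^n/n!, \quad 0 \le t' \le 1.$$

Put

$$H = H(a_0, \cdots, a_{p-1}) = \int_0^p k^2(t')\,dt'.$$

Squaring each of the above expressions for $k$ and collecting like terms, we
see that

$$H = u - 2 v_i a_i + w_{ij} a_i a_j,$$

where a repeated index indicates a sum, $i, j = 0, 1, \cdots, p - 1$, and

$$u = \int_0^p (p - t)^{2n+2}/(n+1)!^2\,dt = p^{2n+3}/(n+1)!^2 (2n+3),$$
$$v_i = \int_0^{p-i} (p - t)^{n+1} (p - i - t)^n/(n+1)!\,n!\,dt = \int_0^{p-i} (s + i)^{n+1} s^n/(n+1)!\,n!\,ds,$$
$$w_{ij} = w_{ji} = \int_0^{p-j} (p - i - t)^n (p - j - t)^n/n!^2\,dt = \int_0^{p-j} (s + j - i)^n s^n/n!^2\,ds, \quad i \le j.$$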


A similar calculation, starting with (1.2), (1.4), shows that

$$\int_{-p}^{0} k^2(t')\,dt' = H(b_0, \cdots, b_{p-1}),$$

where $H$ is precisely the same function as before. Hence

(3.5) $$J = H(a_0, \cdots, a_{p-1}) + H(b_0, \cdots, b_{p-1}).$$

Since $c$ does not enter in (3.5), the constraint (3.4) merely serves to
determine $c$ in terms of $(a_i, b_i)$.

We wish to minimize the quadratic function (3.5) subject to the
constraints (3.2), (3.3). If the constraints are satisfied by $(a_i, b_i)$, they
are also satisfied by $(b_i, a_i)$. Furthermore $J$ is unchanged if $a_i$ and $b_i$ are
interchanged. Hence if $J$, subject to the constraints, assumes a minimum
at a unique point $(a_i, b_i)$, it must be true that $a_i = b_i$, $i = 0, 1, \cdots, p - 1$.
(Otherwise interchanging $a_i$ and $b_i$ would give another minimizing point.)

The constraints (3.2), (3.3) are either inconsistent or consistent. If
inconsistent, the class of formulas $\mathcal{C}_{mn}$ is empty. If the constraints are
consistent, they may be used to eliminate from $J$ certain of the variables
$(a_i, b_i)$. After this elimination $J$ will be a quadratic function of the
remaining variables, considered as independent; and $J$ will be always positive.
The positiveness of $J$ may be seen as follows. If $J = 0$ for $(a_i, b_i)$, then
$k(t) = 0$ almost everywhere. Hence $R[x] = 0$, by (1.1), for all functions
$x(t)$ with absolutely continuous $n$-th derivative. But we can construct such
a function $x(t)$ which vanishes at integral points but whose integral is positive.
For such an $x$, $R[x] \ne 0$.

We now prove that $J$ takes on its minimum at a unique point $(a_i, b_i)$.
Suppose that $q$ of the variables $(a_i, b_i)$ are independent. If $q = 0$, $(a_i, b_i)$
is determined uniquely. If $q > 0$, denote the independent variables by
$y_1, \cdots, y_q$. Each $y_r$ is one of the variables $(a_i, b_i)$. Then $J = J(y)$
$= u' - 2v'_r y_r + w'_{rs} y_r y_s$, say, where repeated indices are summed and
$r, s = 1, \cdots, q$. It will be sufficient to prove that the quadratic form $w'_{rs} y_r y_s$
is positive definite, for $J(y)$ will then have a unique critical point.
Suppose the contrary. Then for some $(y_r)$, say $(y'_r)$, $w'_{rs} y'_r y'_s \le 0$ and
$y'_r y'_r > 0$. Now $w'_{rs} y'_r y'_s < 0$ is impossible, else $J(\lambda y')$ would be negative
for sufficiently large $\lambda$. Hence $w'_{rs} y'_r y'_s = 0$ and $J(\lambda y') = u' - 2v'_r y'_r \lambda$ for
all real $\lambda$. Since $J$ is never negative it follows that $v'_r y'_r = 0$ and $J(\lambda y') = u'$.
But this is impossible. Let $\rho$ be an integer such that $y'_\rho \ne 0$. Let $x(t)$ be
a function which vanishes at integral values of $t$ other than the integral value
corresponding to $y'_\rho$ as a coefficient in (3.1), and which takes on the value
unity there. Then

$$R[x] = \int_{-p}^{p} x(t)\,dt - \lambda y'_\rho$$

for all $\lambda$, by (3.1). This contradicts (1.5) since $J = u'$ is independent of $\lambda$.
Since $a_i = b_i$ at the minimum, our problem is equivalent to the following:
To minimize

(3.6) $$J = 2H(a_0, \cdots, a_{p-1})$$

subject to the constraints

(3.7) $$\sum_{i=0}^{p-1} a_i (p - i)^q = p^{q+1}/(q + 1), \quad q \text{ even}, \ 0 < q \le n.$$

For particular $p$ and $n$, this problem can be solved directly; one may use
the constraints to eliminate certain of the variables and then minimize the
quadratic function of the remaining variables. For example, if $n = 2$, $p = 2$,
(3.6) and (3.7) become

$$J/2 = 32/63 - 16a_0/9 - 37a_1/120 + 8a_0^2/5 + 31a_0 a_1/60 + a_1^2/20,$$
$$4a_0 + a_1 = 8/3.$$

Eliminate $a_1$:

$$J/2 = 13/315 - 7a_0/30 + a_0^2/3.$$

For a minimum, $a_0 = 7/20$. Then $a_1 = 19/15$, $J = 11/12600$. These
results are given in Table 1, $n = 2$, $m = 4$.
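The direct solution described above can be sketched for general $p$ and $n$ with $m = 2p$. This is an illustrative sketch rather than the author's procedure: instead of eliminating variables by hand it solves the Lagrange-multiplier equations for the minimum of (3.6) subject to (3.7), and it assumes the constraints are consistent and independent. The names `integrate_shifted`, `solve` and `best_formula_even` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def integrate_shifted(shift, a, b, upper):
    """Exact integral of (s + shift)**a * s**b over 0 <= s <= upper."""
    total = Fraction(0)
    for r in range(a + 1):
        power = r + b + 1
        total += Fraction(comb(a, r) * shift ** (a - r) * upper ** power, power)
    return total


def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (square, non-singular)."""
    size = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def best_formula_even(n, p):
    """Return (a_0..a_{p-1}, c, J) minimizing (3.6) subject to (3.7), m = 2p."""
    f = factorial
    u = Fraction(p ** (2 * n + 3), f(n + 1) ** 2 * (2 * n + 3))
    v = [integrate_shifted(i, n + 1, n, p - i) / (f(n + 1) * f(n)) for i in range(p)]
    w = [[integrate_shifted(abs(j - i), n, n, p - max(i, j)) / f(n) ** 2
          for j in range(p)] for i in range(p)]
    qs = list(range(2, n + 1, 2))                       # even q with 0 < q <= n
    rows = [[Fraction((p - i) ** q) for i in range(p)] for q in qs]
    d = [Fraction(p ** (q + 1), q + 1) for q in qs]
    k = len(qs)
    # Stationarity 2 W a + C^T lambda = 2 v together with the constraints C a = d.
    M = [[Fraction(0)] * (p + k) for _ in range(p + k)]
    for i in range(p):
        for j in range(p):
            M[i][j] = 2 * w[i][j]
        for r in range(k):
            M[i][p + r] = rows[r][i]
            M[p + r][i] = rows[r][i]
    a = solve(M, [2 * vi for vi in v] + d)[:p]
    H = (u - 2 * sum(vi * ai for vi, ai in zip(v, a))
         + sum(w[i][j] * a[i] * a[j] for i in range(p) for j in range(p)))
    return a, 2 * p - 2 * sum(a), 2 * H


a, c, J = best_formula_even(2, 2)
print(a, c, J)  # a_0 = 7/20, a_1 = 19/15, c = 23/30, J = 11/12600
```

With $n = 2$, $p = 2$ this reproduces the worked example; the coefficients $7/20 = 21/60$, $19/15 = 76/60$, $23/30 = 46/60$ are the entries of Table 1, $n = 2$, $m = 4$.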

In deriving particular best formulas, an elementary fact about quadratic
functions is useful: Suppose that $f = \gamma + 2\gamma_i z_i + \gamma_{ij} z_i z_j$, where repeated
indices are summed, $\gamma_{ij}$, $\gamma_i$, $\gamma$ are constants; and the $z_i$ are independent
variables. At a point $z_i$ at which the differential $df$ vanishes, $f = \gamma + \gamma_i z_i$.

Table 1, $n = 0$, may be extended to all values of $m$ quite easily:
$c_0 = c_m = 1/2$, all other $c_i = 1$. A proof of this fact may be given along
the following lines. First one proves the assertion for the case $m$ even,
working with the explicit function (3.6). Then one derives therefrom the
assertion for $m$ odd by using the inequality (2.6). In this way one avoids
formal complications due to the constraints.


4. Convergence. Consider a fixed $n$ and a sequence $m = n, n+1,
n+2, \cdots$ of best approximations of

$$\int_a^b x(t)\,dt$$

using respectively $m + 1$ equally spaced ordinates $x(a)$, $x(a + (b-a)/m)$,
$\cdots$, $x(b)$, each approximation being exact for degree $n$. Let $x(t)$ be a
function with absolutely continuous $n$-th derivative for which $x^{(n+1)}(t)^2$ is
integrable. Let $R_m[x]$ be the remainder in the approximation involving
$m + 1$ ordinates. Then

$$|R_m[x]| \le J_m^{1/2} \left[ \int_a^b x^{(n+1)}(t)^2\,dt \right]^{1/2},$$

where

(4.1) $$J_m = [(b - a)/m]^{2n+3} J_{mn},$$

by (1.5) and (2.5). An appraisal of the convergence of $J_m$ to zero as
$m \to \infty$ is the following:

(4.2) $$0 \le J_m \le [(b - a)/m]^{2n+3} (\{m/n\} J_{nn} + K_n),$$

where $\{u\}$ is the largest integer contained in $u$ and

$$K_n = \max (J_{nn}, \cdots, J_{2n-1,\,n}) - J_{nn}.$$

The relation (4.2) follows from (4.1) and (2.6).
Thus $R_m[x]$ is of order at most $1/m^{n+1}$.


5. Best approximation formulas. One may consider a far more general
situation than that of the preceding paragraphs.

Suppose that $R[x]$ is any functional which is linear (additive and
continuous) on the space $C_q$ of functions $x = x(t)$ with continuous $q$-th derivative
on $a \le t \le b$, the norm being

$$\|x\| = \max_{i = 0, 1, \cdots, q} \ \max_{a \le t \le b} |x^{(i)}(t)|.$$

Suppose that $R[x]$ vanishes whenever $x(t)$ is a polynomial of degree $n \ge q$.
Then (1.1), (1.2) hold whenever $x(t)$ is a function with absolutely
continuous $n$-th derivative. Furthermore $k$ in (1.2) is an $(n - q)$-fold integral
of a function of bounded variation.
We define the modulus $M$ of $R[x]$ by the relation:

$$M^2 = |K| \int_K k^2(t)\,dt,$$

where $K$ is the set in $a \le t \le b$ on which $k(t) \ne 0$ and $|K|$ is the measure
of $K$.

Suppose that $\mathcal{C}$ is a class of functionals $R[x]$ of the above type. We
think of each $R[x]$ as a remainder. We say that $R$ is a best functional in
the class $\mathcal{C}$ if the modulus of $R$ is a minimum among the moduli of all
the elements of $\mathcal{C}$.

A justification of this definition is as follows. If $x^{(n+1)}(t)^2$ is integrable,
Schwarz's inequality implies that

(5.1) $$|R[x]| \le M \left[ \frac{1}{|K|} \int_K x^{(n+1)}(t)^2\,dt \right]^{1/2},$$

since the integral in (1.1) may be taken over $K$ instead of $[a, b]$. (The
case $|K| = 0$ is ruled out as trivial, for then $R[x] = 0$.) The second
member of (5.1) is the product of the modulus and the root-mean-square of
$x^{(n+1)}(t)$ on $K$. As our choice of the best functional is to be independent
of the particular function $x(t)$, we proceed as if the root-mean-squares of
$x^{(n+1)}(t)$ on different sets $K$ are comparable. The best functional, as just
defined, gives the appraisal (5.1) with the least $M$.

The definition of best approximate integration formula given in Section 1
is consistent with the present definition. For in the cases considered in
Section 1, $k(t)$ is piecewise a polynomial of degree $n + 1$. Accordingly $K$
differs from $[a, b]$ by a finite number of points, $|K| = (b - a)$, and
$M = (b - a)^{1/2} J^{1/2}$. Hence minimizing $J$ is equivalent to minimizing $M$.

As a particular example, suppose that we wish to approximate the
derivative $x'(1/4)$ in terms of a linear combination of certain of the values

(5.2) $$x(-2), \ x(-1), \ x(0), \ x(1),$$

that the approximation is to be exact for degree $n_0$, and that we will appraise
the remainder by (5.1) with $n = n_0$. The more of the values (5.2) that
we use, the smaller becomes the integral in $M^2$. At the same time, however,
the larger $|K|$ may become. If $n_0 = 1$, the approximation using the fewest
and closest values (5.2) seems to be best. If $n_0 = 2$, the approximation
using the fewest and closest values (5.2) is not the best. Details are
omitted here.

The functionals $R[x]$ considered in this section may be the remainders
in many approximating processes, including the following: Approximate
integration other than that considered in Section 1, Approximate integration
of Stieltjes integrals, Interpolation, Extrapolation, Approximate differentiation,
Approximate summation, Approximations by polynomials according
to least squares.

Finally, suppose that $\mathcal{C}$ is a class of linear operations $R[x]$ on $C_q$ to a
function space, say, $C_0$, the space of functions $y = y(u)$ continuous on
$a \le u \le b$. Suppose that each $R[x]$ vanishes whenever $x(t)$ is a polynomial
of degree $n \ge q$. For each fixed $u$, then, $R[x]$ has a modulus $M$.
We may define a best operation (if any) in $\mathcal{C}$ as one which minimizes the
integral of $M^2$ or, alternatively, the maximum of $M^2$, over $a \le u \le b$.


QUEENS COLLEGE.


CHARACTERIZATION OF A FIELD BY A SINGLE OPERATION.* 


By S. Borofsky.


1. Introduction. A field is a system consisting of a set of elements
and two binary operations with properties roughly equivalent to the usual
properties of addition and multiplication in ordinary algebra.¹ These
operations are distinct, that is, the sum of elements $a$ and $b$ cannot always
equal their product. They are not completely independent, however, since
they are connected by the distributive law $a(b + c) = ab + ac$. It is not
surprising, therefore, that a single operation can be found, consisting of some
combination of addition and multiplication, in terms of which the two field
operations can be expressed.

N. Wiener² has shown that a field can be characterized by suitably
ascribing properties to a single operation in such a way that this operation
becomes the equivalent of $1 - a/b$ in the field. If $b = 0$, however, the result
of this operation does not exist, so that the field fails to be closed under the
operation. R. J. Levit³ showed that a field could also be characterized by
another operation, expressible in the form $a(1 - b)$, under which the field
is closed. To avoid some complexity in the definition and properties of
addition and subtraction, Levit demonstrates the sufficiency of his postulates
by deducing Wiener's from his own. However, merely by changing the
primitive operation to $a(b - 1)$, using a different set of postulates, it is
an extremely simple matter to define the field operations and deduce
immediately the necessary properties. This is done in paragraph 2.

In paragraph 3 we discuss other ways of defining the field operations.
In paragraph 4 we show how certain single operations characterize subfields
of a given field.


2. Characterization of a field. Let $S$ be a system consisting of a set
of elements, at least two in number, and an operation $a \circ b$. We assume:


A1. For all $a$, $b$ in $S$, $a \circ b$ exists and is a unique element of $S$.


* Received August 8, 1947.

¹ See, for instance, A Survey of Modern Algebra by Birkhoff and MacLane. Note
that a field must contain at least two elements.

² Transactions of the American Mathematical Society, vol. 21 (1920), pp. 237-246.

³ Transactions of the American Mathematical Society, vol. 57 (1945), pp. 426-40.


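A2. $(a \circ b) \circ c = (a \circ c) \circ b$.
A3. If $x \circ a = x \circ b$ for all $x$ in $S$, then $a = b$.


A4. There exists an element $0$ such that $(a \circ 0) \circ (0 \circ b) = a$ for all
$a$, $b$ in $S$.


A5. There exists an element $\Omega$ such that corresponding to any $a$, $b$ in $S$
with $a \ne \Omega$, there is an $x$ for which $a \circ x = b$.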


For simplicity in form, the statement of the final postulate is deferred 
until we have deduced some further properties of the operation. 


THEOREM 1. If $a \circ b = a \circ c$, $a \ne \Omega$, then $b = c$.
Let $x$ be any element of $S$. Then there is an element $y$ such that $a \circ y = x$
(A5). Then

$x \circ b = (a \circ y) \circ b$
$= (a \circ b) \circ y$ (A2)
$= (a \circ c) \circ y$ (Hyp.)
$= (a \circ y) \circ c$ (A2)
$= x \circ c$.

Hence $b = c$ (A3).
COROLLARY. The $x$ in A5 is unique.
THEOREM 2. There is a unique $0$ and a unique $\Omega$ and they are equal.


Choose any $0$ and any $\Omega$. Then there must be an element $a$ such that
$a \circ 0 \ne \Omega$. For otherwise for every $a$ we have $a = (a \circ 0) \circ (0 \circ \Omega)$ by A4,
so that every element equals $\Omega \circ (0 \circ \Omega)$, contrary to the existence of at
least two elements.

Let $a$ be so chosen that $a \circ 0 \ne \Omega$. Then, for any elements $b$, $c$ we have
$(a \circ 0) \circ (0 \circ b) = (a \circ 0) \circ (0 \circ c)$ by A4. Therefore, $0 \circ b = 0 \circ c$ (Theorem
1) for all $b$, $c$. If $0 \ne \Omega$, this requires $b = c$ (Theorem 1), which is
impossible.

Thus every $0$ equals every $\Omega$, which establishes the desired result.


THEOREM 3. $0 \circ a = 0$ for every $a$.


We have already seen (preceding proof) that $0 \circ b = 0 \circ c$ for all $b$, $c$.
Letting $c = 0$ we have $0 \circ b = 0 \circ 0$. Hence $(0 \circ b) \circ 0 = (0 \circ 0) \circ 0$. But
$(0 \circ b) \circ 0 = (0 \circ 0) \circ b$ by A2. Therefore $(0 \circ 0) \circ b = (0 \circ 0) \circ 0$.

If $0 \circ 0 \ne 0$, this requires $b = 0$ (Theorem 1) for all $b$, which is
impossible. Hence $0 \circ 0 = 0$, so that $0 \circ b = 0$.


THEOREM 4. There is a unique $x$ such that $a \circ x = 0$ for all $a$.
Let $b \ne 0$ be a fixed element. Let $b \circ x = 0$ (A5).
If $a$ is any element, let $b \circ y = a$ (A5). Then
$a \circ x = (b \circ y) \circ x$
$= (b \circ x) \circ y$ (A2)
$= 0 \circ y$
$= 0$ (Theorem 3).
The uniqueness of such an $x$ is immediate from A3.
DEFINITION 1. The unique $x$ of Theorem 4 we denote by $1$.
THEOREM 5. $1 \ne 0$.
If $1 = 0$, then for every $a$, $b$
$a = (a \circ 1) \circ (0 \circ b)$ (A4)
$= 0 \circ 0$ (Theorem 4, Theorem 3)
$= 0$ (Theorem 3)
which is impossible.
THEOREM 6.⁴ For every $a$, $b$ with $a \ne 1$ there exists a unique $x$ such
that $x \circ a = b$.
Let $c \ne 0$. If $c \circ a = 0$ then $c \circ a = c \circ 1$ (Theorem 4) so that $a = 1$
(Theorem 1). Hence $c \circ a \ne 0$.


Let $(c \circ a) \circ y = b$ (A5). Let $x = c \circ y$. Then
$x \circ a = (c \circ y) \circ a$
$= (c \circ a) \circ y$ (A2)
$= b$.


Suppose we also have $x' \circ a = b$. Let $c \circ z = x'$ (A5). Then
$b = x' \circ a = (c \circ z) \circ a = (c \circ a) \circ z$ (A2).


But $b = (c \circ a) \circ y$. Hence $(c \circ a) \circ y = (c \circ a) \circ z$, so that $y = z$ (Theorem
1). Therefore $x' = c \circ z = c \circ y = x$.


DEFINITION 2. $-a = a \circ 0$.
THEOREM 7. $-0 = 0$.
THEOREM 8. $-(-a) = a$.


⁴ Not needed in this paragraph; used in paragraph 3.


For, $-(-a) = (-a) \circ 0$ (Definition 2)
$= (a \circ 0) \circ 0$ (Definition 2)
$= (a \circ 0) \circ (0 \circ b)$ (Theorem 3)
$= a$ (A4).

COROLLARY. $(-a) \circ 0 = a$.

THEOREM 9. $(-a) \circ b = -(a \circ b)$.

For, $-(a \circ b) = (a \circ b) \circ 0$ (Definition 2)
$= (a \circ 0) \circ b$ (A2)
$= (-a) \circ b$ (Definition 2).


DEFINITION 3. $ab = (-a) \circ ((-1) \circ b)$.


THEOREM 10. $a1 = a$.

For, $a1 = (-a) \circ ((-1) \circ 1)$ (Definition 3)
$= (-a) \circ 0$ (Theorem 4)
$= a$ (Corollary to Theorem 8).

COROLLARY. $1 \cdot 1 = 1$.

THEOREM 11. $(-a)b = -ab$.

For, $(-a)b = (-(-a)) \circ ((-1) \circ b)$ (Definition 3)
$= -[(-a) \circ ((-1) \circ b)]$ (Theorem 9)
$= -ab$ (Definition 3).

THEOREM 12. $a0 = 0a = 0$.

For, $a0 = (-a) \circ ((-1) \circ 0)$ (Definition 3)
$= (-a) \circ 1$ (Corollary to Theorem 8)
$= 0$ (Theorem 4),

and $0a = (-0) \circ ((-1) \circ a)$ (Definition 3)
$= -[0 \circ ((-1) \circ a)]$ (Theorem 9)
$= -0$ (Theorem 3)
$= 0$ (Theorem 7).

THEOREM 13. For any $a$, $b$ with $a \ne 0$ there is a unique $x$ such that
$ax = b$.

Since $a \ne 0$, therefore $-a \ne 0$ (Theorems 7, 8). Let $(-a) \circ y = b$ (A5).
Since $1 \ne 0$ (Theorem 5), therefore $-1 \ne 0$ and there exists an $x$
such that $(-1) \circ x = y$ (A5). Then
$ax = (-a) \circ ((-1) \circ x)$ (Definition 3)
$= (-a) \circ y = b$.

If we also have $ax' = b$, then $ax = ax'$, so that

$(-a) \circ ((-1) \circ x) = (-a) \circ ((-1) \circ x')$ (Definition 3)
$(-1) \circ x = (-1) \circ x'$ (Theorem 1)
$x = x'$ (Theorem 1).


COROLLARY. If $ab = 0$ then $a = 0$ or $b = 0$.


For if $a \ne 0$ then $a0 = 0$ (Theorem 12) so that $ab = a0$ and $b = 0$
(Theorem 1).


THEOREM 14. $(a \circ b)c = ac \circ b$.


For, $(a \circ b)c = [-(a \circ b)] \circ ((-1) \circ c)$ (Definition 3)
$= [(-a) \circ b] \circ [(-1) \circ c]$ (Theorem 9)
$= [(-a) \circ ((-1) \circ c)] \circ b$ (A2)
$= ac \circ b$ (Definition 3).


COROLLARY. $(ab)c = (ac)b$.
In Theorem 14 replace $a$ by $-a$ and $b$ by $(-1) \circ b$. Then

$[(-a) \circ ((-1) \circ b)]c = (-a)c \circ ((-1) \circ b)$
$(ab)c = (-ac) \circ ((-1) \circ b)$ (Definition 3, Theorem 11)
$= (ac)b$ (Definition 3).

DEFINITION 4. If $a \ne 0$, then $a^{-1}$ is the unique $x$ satisfying $ax = 1$.


DEFINITION 5. $a + b = (-b) \circ (-ab^{-1})$ if $b \ne 0$,
$a + b = a$ if $b = 0$.


THEOREM 15. $a + 0 = 0 + a = a$.
For $a = 0$ it is immediate. For $a \ne 0$

$0 + a = (-a) \circ (-0a^{-1})$ (Definition 5)
$= (-a) \circ 0$ (Theorem 12)
$= a$ (Corollary to Theorem 8).


THEOREM 16. $(-a) + a = a + (-a) = 0$.
For $a = 0$ it is immediate. For $a \ne 0$

$(-a) + a = (-a) \circ (-(-a)a^{-1})$ (Definition 5)
$= (-a) \circ (aa^{-1})$ (Theorems 11, 8)
$= (-a) \circ 1$ (Definition 4)
$= 0$ (Theorem 4).


The second part of the theorem follows if we replace $a$ by $-a$.


THEOREM 17. $(b + c)a = ba + ca$.
If any of $a$, $b$, $c$ is $0$ the result is immediate. Suppose none is $0$. Then

$(b + c)a = [(-c) \circ (-bc^{-1})]a$ (Definition 5)
$= (-c)a \circ (-bc^{-1})$ (Theorem 14)
$= (-ca) \circ (-bc^{-1})$ (Theorem 11)
$ba + ca = (-ca) \circ (-(ba)(ca)^{-1})$ (Definition 5;
also by Corollary to Theorem 13 $ca \ne 0$).


Let $cx = b$ (Theorem 13). Then

$(ba)(ca)^{-1} = [(cx)a](ca)^{-1}$
$= [(ca)x](ca)^{-1}$ (Corollary to Theorem 14)
$= [(ca)(ca)^{-1}]x$ (Corollary to Theorem 14)
$= 1x$ (Definition 4).


Also $bc^{-1} = (cx)c^{-1} = (cc^{-1})x$ (Corollary to Theorem 14)
$= 1x$ (Definition 4).


Therefore, $(b + c)a = ba + ca$.
We now introduce the final postulate,
A6. $(1 \circ a) + b = (1 \circ b) + a$.
THEOREM 18. $ab = ba$.


Taking $b = 1$ in A6, $(1 \circ a) + 1 = (1 \circ 1) + a$
$= 0 + a$ (Theorem 4)
$= a$ (Theorem 15).


But $(1 \circ a) + 1 = (-1) \circ [-(1 \circ a)1^{-1}]$ (Definition 5)
$= (-1) \circ [-(1 \circ a)1]$ (Definition 4, Theorem 13,
Corollary to Theorem 10)
$= (-1) \circ [-(1 \circ a)]$ (Theorem 10)
$= 1a$ (Definition 3).
Thus $1a = a$.


Taking $a = 1$ in Corollary to Theorem 14, $(1b)c = (1c)b$, so that
$bc = cb$.


THEOREM 19. a(bc) = (ab)c. 


For, a(bc) = (bc)a (Theorem 18) 
= (ba)c (Corollary to Theorem 14) 
= (ab)c (Theorem 18). 




THEOREM 20. $a + b = b + a$.


For $a = 0$ in A6, $(1 \circ 0) + b = (1 \circ b) + 0$,
$(-1) + b = 1 \circ b$ (Definition 2, Theorem 15).


Also, $(-x)(-y) = -[x(-y)]$ (Theorem 11)
$= -[(-y)x]$ (Theorem 18)
$= -(-yx)$ (Theorem 11)
$= yx$ (Theorem 8)


so that $(-1)(-1) = 1 \cdot 1 = 1$ (Corollary to Theorem 10), from which
$(-1)^{-1} = -1$.


We have, therefore, $b + (-1) = [-(-1)] \circ [-b(-1)^{-1}]$
(Definition 5)
$= [-(-1)] \circ [-(-b)]$
(Theorems 18, 11, 10)
$= 1 \circ b$ (Theorem 8).


Thus, $b + (-1) = (-1) + b$ for every $b$.


For $a \ne 0$ we have $a + b = (-1)(-a) + (-ba^{-1})(-a)$
(Theorems 18, 19; Definition 4; Theorem 10)
$= [(-1) + (-ba^{-1})](-a)$ (Theorem 17)
$= (-ba^{-1})(-a) + (-1)(-a)$
(Theorem 17)
$= b + a$
(Theorems 18, 19; Definition 4; Theorem 10).


For $a = 0$ the result is immediate from Definition 5.
THEOREM 21. $(a + b) + c = a + (b + c)$.


We have $[a + (-1)] + b = (1 \circ a) + b$ (shown in Theorem 20)
$= (1 \circ b) + a$ (A6)
$= [b + (-1)] + a$ (shown in Theorem 20).


For $c \ne 0$, $(a + c) + b = [((-ac^{-1}) + (-1)) + (-bc^{-1})](-c)$
(Theorems 17, 18; Definition 4; Theorem 10)
$= [((-bc^{-1}) + (-1)) + (-ac^{-1})](-c)$
(as just shown)
$= (b + c) + a$
(Theorems 17, 18; Definition 4; Theorem 10).


For $c = 0$ the desired result is obvious.


Remark 1. The preceding theorems show that the system consisting of
the elements of $S$ and the operations of addition and multiplication, as defined,
is a field. For the primitive operation we have

$a \circ b = 1a \circ b$ (Theorems 18, 10)
$= (1 \circ b)a$ (Theorem 14)
$= a(1 \circ b)$ (Theorem 18)
$= a[b + (-1)]$ (shown in Theorem 20)
$= a(b - 1)$.


Conversely, it is obvious that if we start with a field, with addition and
multiplication as primitive operations, and define $a \circ b$ as $a(b - 1)$, then
this operation satisfies A1-A6. Moreover, if we then use $a \circ b$ as a primitive
operation and define addition and multiplication as in the preceding, then
the two defined operations are exactly the same as the original operations.


Remark 2. The operation $a(1 - b)$ satisfies A1-A5; it also satisfies A6
when and only when the field has characteristic two. Thus, A6 is independent
of A1-A5.
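The converse stated in Remark 1 can be checked mechanically in a small finite field. The sketch below is only an illustration: it assumes the field of integers modulo the prime 7 (an arbitrary choice), starts from the single operation $a \circ b = a(b - 1)$, recovers $0$, $1$, negation, multiplication and addition through A4, Theorem 4 and Definitions 2-5, and compares them with ordinary modular arithmetic. All function names are illustrative.

```python
P = 7          # illustrative prime; the field is the integers modulo P
S = range(P)


def op(a, b):
    """Primitive operation a o b = a(b - 1), computed modulo P."""
    return a * (b - 1) % P


# The element 0 of A4 and the element 1 of Definition 1, found by search.
ZERO = next(z for z in S
            if all(op(op(a, z), op(z, b)) == a for a in S for b in S))
ONE = next(x for x in S if all(op(a, x) == ZERO for a in S))


def neg(a):                      # Definition 2
    return op(a, ZERO)


def mul(a, b):                   # Definition 3
    return op(neg(a), op(neg(ONE), b))


def inv(a):                      # Definition 4, for a != 0
    return next(x for x in S if mul(a, x) == ONE)


def add(a, b):                   # Definition 5
    return a if b == ZERO else op(neg(b), neg(mul(a, inv(b))))


# Postulates A2 and A6, checked by exhaustion.
A2 = all(op(op(a, b), c) == op(op(a, c), b) for a in S for b in S for c in S)
A6 = all(add(op(ONE, a), b) == add(op(ONE, b), a) for a in S for b in S)

same_add = all(add(a, b) == (a + b) % P for a in S for b in S)
same_mul = all(mul(a, b) == (a * b) % P for a in S for b in S)
print(ZERO, ONE, A2, A6, same_add, same_mul)  # prints 0 1 True True True True
```

The search-based definitions of `ZERO`, `ONE` and `inv` mirror the existence statements of the text; they are practical only for a small field.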


3. Other field operations. It is possible to define addition and
multiplication in terms of the primitive operation $a \circ b$ differently from the way
it was done in the preceding and still have the resulting system turn out
to be a field. For convenience, let $F_0$ denote the field defined in paragraph 2 and let
operations performed in this field be symbolized by $[\ ]_0$. Let $\alpha$, $\beta$ be any
fixed elements, $\alpha \ne 0$. If we now define

(1)
$a + b = [a + b - \beta]_0$
$ab = [\alpha(a - \beta)(b - \beta) + \beta]_0$

it is easily verified that the resulting system $F$ is a field. In $F$ the zero
element for addition is $\beta$; the negative of $a$ is $[-a + 2\beta]_0$, and $a - b$
$= [a - b + \beta]_0$. Also, the unity element for multiplication in $F$ is
$[\beta + 1/\alpha]_0$, and the inverse of $a \ne \beta$ is $[\beta + 1/\{\alpha^2(a - \beta)\}]_0$. For $\beta = 0$,
$\alpha = 1$, $F$ is identical with $F_0$.
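The properties claimed for the system $F$ can be confirmed by exhaustion in a small example. The sketch below assumes $F_0$ is the field of integers modulo 7 and picks $\alpha = 3$, $\beta = 5$ arbitrarily; these values and the function names `add_F`, `mul_F` are illustrative only.

```python
P = 7                    # illustrative prime for the base field F_0
ALPHA, BETA = 3, 5       # hypothetical fixed elements, ALPHA != 0
S = range(P)


def add_F(a, b):
    """Addition of F from (1): a + b - beta, computed in F_0."""
    return (a + b - BETA) % P


def mul_F(a, b):
    """Multiplication of F from (1): alpha (a - beta)(b - beta) + beta."""
    return (ALPHA * (a - BETA) * (b - BETA) + BETA) % P


UNITY = (BETA + pow(ALPHA, -1, P)) % P      # beta + 1/alpha


def inverse_F(a):
    """Inverse of a != beta in F: beta + 1/(alpha^2 (a - beta))."""
    return (BETA + pow(ALPHA * ALPHA * (a - BETA), -1, P)) % P


zero_ok = all(add_F(a, BETA) == a for a in S)
negative_ok = all(add_F(a, (2 * BETA - a) % P) == BETA for a in S)
unity_ok = all(mul_F(a, UNITY) == a for a in S)
inverse_ok = all(mul_F(a, inverse_F(a)) == UNITY for a in S if a != BETA)
distributive_ok = all(
    mul_F(a, add_F(b, c)) == add_F(mul_F(a, b), mul_F(a, c))
    for a in S for b in S for c in S)
print(zero_ok, negative_ok, unity_ok, inverse_ok, distributive_ok)
```

The modular inverse uses the three-argument `pow` with exponent -1, which requires Python 3.8 or later.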

The definitions (1) express the operations of $F$ in terms of those of $F_0$.
Direct verification shows that we also have

(2)
$[a + b]_0 = a + b - \delta$
$[ab]_0 = \gamma(a - \delta)(b - \delta) + \delta$


where $\delta = 0$, and $\gamma = [\beta + (1/\alpha^2)]_0$ is a non-zero element of $F$. Conversely,
from (2) we can obtain (1) with $\alpha = \delta + (1/\gamma^2)$, if we assume $F$ to be a field.

In terms of the operations of $F_0$, the primitive operation $a \circ b$ is expressible
in the form $[a(b - 1)]_0 = [ab - a]_0$. Transforming to the operations of $F$
by means of (2), we have

(3) $a \circ b = \gamma(a - \delta)(b - \delta) + 2\delta - a$.


It is natural to inquire whether (1) gives all possible ways of defining
addition and multiplication in terms of the primitive operation $a \circ b$ so that
the resulting system $F$ is a field. A partial answer is given by the following:


THEOREM. If addition and multiplication be defined in terms of $a \circ b$
in such a way that the resulting system $F$ is a field and so that $a \circ b$ is
expressible in the form

(4) $a \circ b = (\lambda_1 ab + \mu_1 a + \nu_1 b + \sigma_1)/(\lambda_2 ab + \mu_2 a + \nu_2 b + \sigma_2)$,

where $\lambda_i$, $\mu_i$, $\nu_i$, $\sigma_i$ are elements of $F$, then the operations of $F$ are expressible
in the form (1).⁵


Suppose the conditions of the hypothesis satisfied. We show that $a \circ b$
must be expressible in the form (3) and that, consequently, $a + b$ and $ab$
are given by (1).

To prove the second part first, suppose a°6 given by (3). Let 8 andi 
denote the zero and unity elements of F. We have 


(a) For if then aob = 28—a, contrary to A3. 
(b) 801=—8. Hence (Theorem 4). 
(c) ao(8+i/y) =0. Hence i/y (Theorem 4). 


(d) [—a]o—a00 (Definition 2). 
= 25—a 


(e) [ab]o = [—a]o° ([—1].°b) (Definition 3) 
= (28— a) 0 ((28—1) 0b) (part d) 
= (28—a) ((28—8—i/y) ob) (part c) 
= y(a— 8) (b —8) + 38 by applying (3) 


(f) for a0, [a], is the solution of [ax],—1, which is 
5 + i/{y?(a—8)} by parts e, ¢ 


⁵ Compare Levit, loc. cit.³, Theorem I. The result above differs in nature from
Levit's since here $F$ is a particular field.


(g) for b0, [a+ b],=[—b].° [—ab"], (Definition 5) 
= (28— b) o (28— [ab].) (part d) 
= (23 —b) o (28— (y(a—8) ([b*]o—8) +8)) (part e) 
— (28 —b) (283—y(a—8) (8 + i/{y?(b —8)} —8) —8) 
(part f) 
= (28— b) o (8— {a— d}/{y(b —8)}) 
=a-+b—6 from (3). 
for b =0 we have [a+ 


Thus, we have equations (2) and, consequently, (1). 


We now show that if ao} has the form (4) it must have the form (3). 
Since a°b exists and is unique for all a,b, the denominator in (4) 
can never equal 8. This requires Az = Hence oo and 


acb=)dab+ pa+vb +o. 


If then (Theorem 3). This requires 
Thus —ypa-+a, contrary to A3. Therefore, B. 

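Since $a \circ x = b$ is uniquely solvable for $x$ whenever $a \neq 0$ (Corollary to
Theorem 1), therefore $\lambda a + \nu$, the coefficient of $x$, can equal $\theta$ only when
$a = 0$. Therefore $\nu = -\lambda \cdot 0$.

Similarly, since $x \circ a = b$ is uniquely solvable for $x$ whenever $a \neq 1$
(Theorem 6), we must have $\mu = -\lambda \cdot 1$.

From $0 \circ b = 0$ (Theorem 3) we now obtain $\sigma = 0 + \lambda \cdot 1 \cdot 0$. Thus


$a \circ b = \lambda ab - \lambda \cdot 1 \cdot a - \lambda \cdot 0 \cdot b + 0 + \lambda \cdot 1 \cdot 0 = \lambda(a - 0)(b - 1) + 0$.


From $(1 \circ 0) \circ 0 = 1$ (A4, Theorem 3) we have $\lambda^2(1 - 0)^3 + 0 = 1$.
This gives $\lambda(1 - 0) = \pm i$.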


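If $\lambda(1 - 0) = i$, then $1 = 0 + i/\lambda$ and
$a \circ b = \lambda(a - 0)(b - 0 - i/\lambda) + 0 = \lambda(a - 0)(b - 0) + 0 + 0 - a$,
which is the desired form (3).
If $\lambda(1 - 0) = -i$, then $1 = 0 - i/\lambda$ and
(5) $a \circ b = \lambda(a - 0)(b - 0 + i/\lambda) + 0 = \lambda(a - 0)(b - 0) + a$.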


In this case we now show that F has characteristic two so that this form is 
again the desired form. 


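From (5) we have, as before,


$1 = 0 - i/\lambda$
$[-a]_\circ = a$
$[ab]_\circ = -\lambda(a - 0)(b - 0) + 0$
$[a^{-1}]_\circ = 0 + i/\{\lambda^2(a - 0)\}$ for $a \neq 0$
$[a + b]_\circ = b - a + 0$ for $b \neq 0$
$= a$ for $b = 0$
$[(1 \circ a) + b]_\circ = a + b - 1$ for $b \neq 0$.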
Taking a=0, b in A6, [(1°0) + 
—0)(6—0) +1 
4-1. 


Thus ) —_1——0-+1 and F has characteristic two. 


4. Characterization of subfields. If we have a group under the operation
$a \cdot b$, a non-empty subset S is a subgroup if and only if it contains the
solution of $x \cdot a = b$ whenever it contains a and b. Since a field can also be
characterized by a single operation, we inquire whether a subfield can be
characterized by closure under the inverse of this operation. With a mild
restriction this is true.

Let F be a field and let $a \circ b$ be given by (3), where $\delta$ and $\gamma$ are any
fixed elements of F and $\gamma \neq 0$. Let $a \wedge b$ denote the solution of $x \circ b = a$,
so that

$a \wedge b = \delta + (a - \delta)/\{\gamma(b - \delta) - 1\}$ for $b \neq \delta + 1/\gamma$.


Similarly let $a \vee b$ denote the solution of $b \circ x = a$, so that
$a \vee b = \delta + (a + b - 2\delta)/\{\gamma(b - \delta)\}$ for $b \neq \delta$.


For convenience, we say a subset S of F is closed under these operations
if their results are in S whenever a and b are, provided b does not have the
excluded values.
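
The two inverse operations can be checked against their defining equations. The sketch below assumes, for illustration only, that F is the field of integers modulo 11; `wedge` and `vee` are hypothetical names for the two operations just defined, and `circ` implements form (3). It also confirms the identity $a \wedge (b \wedge b) = a \circ b$ that is used later in the section.

```python
P = 11  # illustrative prime; arithmetic is in the integers modulo P


def inv(x):
    """Multiplicative inverse modulo P (x must be non-zero modulo P)."""
    return pow(x % P, P - 2, P)


def circ(a, b, gamma, delta):
    """Form (3): a o b = gamma (a - delta)(b - delta) + 2 delta - a."""
    return (gamma * (a - delta) * (b - delta) + 2 * delta - a) % P


def wedge(a, b, gamma, delta):
    """Solution x of x o b = a; requires b != delta + 1/gamma."""
    return (delta + (a - delta) * inv(gamma * (b - delta) - 1)) % P


def vee(a, b, gamma, delta):
    """Solution x of b o x = a; requires b != delta."""
    return (delta + (a + b - 2 * delta) * inv(gamma * (b - delta))) % P


def formulas_hold(gamma, delta):
    excluded = (delta + inv(gamma)) % P
    for a in range(P):
        for b in range(P):
            if b != excluded:
                x = wedge(a, b, gamma, delta)
                if circ(x, b, gamma, delta) != a:
                    return False
                # b ^ b is never the excluded value, so the outer wedge is defined
                bb = wedge(b, b, gamma, delta)
                if wedge(a, bb, gamma, delta) != circ(a, b, gamma, delta):
                    return False
            if b != delta % P:
                y = vee(a, b, gamma, delta)
                if circ(b, y, gamma, delta) != a:
                    return False
    return True
```

Running `formulas_hold` over all non-zero `gamma` and all `delta` covers every parameter choice in this small field.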

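If a subfield S contains $\delta$ and $\delta + 1/\gamma$, then it contains $1/\gamma = (\delta + 1/\gamma) - \delta$,
so that $\gamma$ is also in S. Hence S is closed under the operation $a \wedge b$.
Conversely, if a subfield S is closed under $a \wedge b$, then it must contain $\delta$ and
$\delta + 1/\gamma$. For, if $\delta + 1/\gamma$ is neither 0 nor 1, then


$1/(\gamma\delta + 1) = (0 \wedge 0) - (1 \wedge 0)$
and


$1/\{\gamma(1 - \delta) - 1\} = (1 \wedge 1) - (0 \wedge 1)$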




are also in S, from which it follows that $\delta$ and $\delta + 1/\gamma$ are in S. If $\delta + 1/\gamma$
is 0 or 1, then the same result follows from this fact and the fact that one
of the elements $1/(\gamma\delta + 1)$, $1/\{\gamma(1 - \delta) - 1\}$ (whichever is appropriate)
is in S.

A converse result is contained in the following: 


THEOREM. If subset S contains 0, 1, $\delta$ and $\delta + 1/\gamma$, and is closed under
the operation $a \wedge b$, then S is a subfield.


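For a and b in S and $b \neq \delta + 1/\gamma$, we have


$b \wedge b = \delta + (b - \delta)/\{\gamma(b - \delta) - 1\} \neq \delta + 1/\gamma$
$a \wedge (b \wedge b) = \gamma(a - \delta)(b - \delta) + 2\delta - a = a \circ b$.


Since $a \circ (\delta + 1/\gamma) = \delta$, therefore S is closed under $a \circ b$.


Also $[(\delta + 1/\gamma) \circ (a \wedge \delta)] \wedge \delta = a + 1/\gamma$ is in S. Hence for any a, b
in S with $a \neq \delta$, $b \wedge (a + 1/\gamma) = \delta + (b - \delta)/\{\gamma(a - \delta)\}$ is in S. By
the preceding step, for $a \neq \delta$, $\delta + 1/\gamma + (b - \delta)/\{\gamma(a - \delta)\} = b \vee a$ is
in S.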

It is now easily verified that the system consisting of S and the operation
$a \circ b$ satisfies A1-A6. If we define $[ab]_\circ$ and $[a + b]_\circ$ by definitions 3 and
5, we have

$[ab]_\circ = \gamma(a - \delta)(b - \delta) + \delta$
$[a + b]_\circ = a + b - \delta$.


With these two operations, S is a field, by 2. 
It follows, as in 3, that 


atb= [fa+b—0O], 
ab = [a(a—0)(b—0) + 0], where 1/y?. 


Since [1—0],—1-+ 8, therefore e148 is in Since [e+], 
=6-+1/y’, therefore a is in 8. 


Hence S is a subfield of F.


Remark. To establish the conclusion of the theorem it is necessary to
assume that the subset S contains all four of the elements 0, 1, $\delta$, $\delta + 1/\gamma$.
That is, the fact that S is closed under $a \wedge b$ and contains three of these
elements does not assure that it contains the fourth. This is shown by the
following examples in each of which S is closed under $a \wedge b$ and contains
the three elements indicated, but does not contain the fourth.




(a) F any field with at least three elements; $\delta$ different from 0 and 1;
$\gamma$ non-zero but otherwise arbitrary; S the set of elements of F different from $\delta$.
S contains 0, 1, $\delta + 1/\gamma$.

(b) F any field of characteristic two with at least three elements; $\delta = 0$;
$\gamma$ different from 0 and 1; S the subset consisting of 0 and $1/\gamma$. S contains
0, $\delta$, $\delta + 1/\gamma$.

(c) F the field of rational functions of x with coefficients in a given
field of characteristic three; $\delta = 1$; $\gamma = 1/x$; S the subset consisting of
$1$, $1 + x$, $1 - x$. S contains 1, $\delta$, $\delta + 1/\gamma$.

(d) F the field of rational functions of the complex variable z; $\delta = 0$;
$\gamma = z$; S the set of rational functions with a finite value for $z = 0$. S contains
0, 1, $\delta$.


For the operation $a \vee b$ we have a similar result.


THEOREM. If subset S contains 0, 1, $\delta$, and $\delta - 1/\gamma$ and is closed under
the operation $a \vee b$, then S is a subfield.


Let a and b be in S and $b \neq \delta$. Then $\delta \vee b = \delta + 1/\gamma$ is in S.


Also $a \vee (\delta - 1/\gamma) = -a + 1/\gamma + 2\delta$. Replacing a in this by $a \vee b$,
we have $\delta + (\delta - a)/\{\gamma(b - \delta)\}$. Replacing b in the last expression by
$-b + 1/\gamma + 2\delta$ for $b \neq \delta + 1/\gamma$, we have $\delta + (a - \delta)/\{\gamma(b - \delta) - 1\} = a \wedge b$.

By the preceding theorem, S is a subfield.


Remark. As above, examples can be given to show that S may be closed
under $a \vee b$ and contain three of the elements 0, 1, $\delta$, $\delta - 1/\gamma$ without
containing the fourth.


BROOKLYN COLLEGE. 



ON A DECOMPOSITION INTO SINGULARITIES OF THETA- 
FUNCTIONS OF FRACTIONAL INDEX.* 


By AUREL WINTNER.


Introduction. The angular analogue of Cauchy’s symmetric stable dis- 
tributions leads to Fourier series which are formal generalizations of Jacobi’s 
elliptic $\vartheta_3$-function, the latter being the angular analogue of the limiting
case of a symmetric Gaussian distribution (for details, cf. [2]). The Fourier 
series in question are those given by (3) and (2) below. 

The object of this note is to determine the singularities of this function. 
This will be accomplished by developing the function into a series which puts 
the singularities into evidence. The series is of the Mittag-Leffler type but 
turns out to consist of branch points, instead of poles. The latter branch 
points are all logarithmic or all algebraic according as the “ fractional ” index 
is irrational or rational. 

The final result is stated at the end of the paper. 


1. If
(1) $0 < \lambda < 1$
and
(2) $0 < q < 1$,
then the series
(3) $1 + 2 \sum_{n=1}^{\infty} q^{n^\lambda} \cos 2\pi n x$


defines, for real x, an even function having the period 1 and possessing deriva- 
tives of arbitrarily high order. But the function cannot be regular-analytic 
at every real x, since (1) implies that the coefficients of (3) do not tend to 
0 as fast as the terms of a convergent geometric progression. In what follows, 
the singularities of this function of x will be determined. 

The corresponding question does not arise if (1) is replaced by $\lambda \ge 1$.
In fact, if $\lambda = 1$, then (3) becomes the elementary function


$(1 - q^2)/(1 - 2q \cos 2\pi x + q^2)$


(a Green function belonging to Laplace’s equation), while if A > 1, then the 


* Received April 10, 1948. 




coefficients of (3) tend to 0 faster than the terms of any geometric progression, 
and so (3) becomes a transcendental entire function of x.
In particular, if $\lambda = 2$, then (3) becomes Jacobi's elliptic $\vartheta_3$ (a Green
or "source" function belonging to Fourier's equation).
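
The case $\lambda = 1$ can be checked numerically. The following sketch, under the assumption that 200 terms of the series are more than enough for the chosen value of q, compares a partial sum of (3) with the closed form quoted above; `theta_partial` and `poisson_kernel` are hypothetical helper names.

```python
import math


def theta_partial(x, q, lam, terms=200):
    """Partial sum of 1 + 2 * sum_{n>=1} q**(n**lam) * cos(2*pi*n*x)."""
    total = 1.0
    for n in range(1, terms + 1):
        total += 2.0 * q ** (n ** lam) * math.cos(2.0 * math.pi * n * x)
    return total


def poisson_kernel(x, q):
    """Closed form of the series for lam = 1."""
    return (1.0 - q * q) / (1.0 - 2.0 * q * math.cos(2.0 * math.pi * x) + q * q)
```

The same `theta_partial` can be evaluated for $0 < \lambda < 1$, where the coefficients decay more slowly than any geometric progression, so that many more terms are needed for comparable accuracy.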
2. Suppose (1) and put, if $0 < x < \infty$,

(4) $f(x) = \int_0^{\infty} e^{-t^\lambda} \cos xt \, dt$


or (what, by Fourier’s inversion, is the same thing) 


(5) f f (4) cos at dt. 
0 
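Then, as shown in [1], the expansion
(6) $f(x) = \sum_{m=0}^{\infty} c_m / x^{m\lambda + \lambda + 1}$,
where $x^{m\lambda + \lambda + 1} > 0$ and
(7) $(-1)^m m! \, c_m = \lambda \Gamma([m + 1]\lambda) \sin(\tfrac{1}{2}\pi [m + 1]\lambda)$,
holds for
(8) $0 < x < \infty$.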

The convergence of the series (6) at every point of the half-line (8)
is part of the statement. Hence, if (6) is multiplied by $x^{\lambda + 1} > 0$, it follows
that $x^{\lambda + 1} f(x)$, where $x > 0$, defines a transcendental entire function of $z = x^{-\lambda}$.

It should be mentioned for later use that, as $x \to \infty$,


(9) f(x) ~ ¢o/x* = O(a), where 


This is seen from (6) and (7), or rather from the first approximation to the 
proof of (6). 


3. For the present, let f(x) be thought of as being defined by (4) on
the line


(10) $-\infty < x < \infty$,


rather than just on the half-line $0 < x < \infty$. Thus $f(x) = f(-x)$, and so,
by (9),
(11) $f(x) = O(|x|^{-\lambda - 1})$, where $\lambda + 1 > 1$.




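It is trivial from (11), where $\lambda > 0$, and from (5), where $x > 0$,
that the conditions for the applicability of Poisson's summation formula


$\sum_{n=-\infty}^{\infty} F(x + n) = \sum_{n=-\infty}^{\infty} e^{2\pi i n x} \int_{-\infty}^{\infty} F(t) e^{-2\pi i n t} \, dt$


are satisfied by $F(x) = f(x)$ and also by $F(x) = f(px)$, where p is any
positive constant. This means that the series


(12) $\sum_{n=-\infty}^{\infty} f(px + pn)$


defines a function which has the period 1 and is represented by its Fourier
series, which is


(13) $\sum_{n=-\infty}^{\infty} e^{2\pi i n x} \int_{-\infty}^{\infty} f(pt) e^{-2\pi i n t} \, dt$.


Since $f(t) = f(-t)$, the integral occurring in (13) is identical with


$2 \int_0^{\infty} f(pt) \cos 2\pi n t \, dt = (2/p) \int_0^{\infty} f(t) \cos(2\pi n t/p) \, dt$,


the constant p being positive. Hence, (5) shows that the series (13) can be
contracted into


$(2\pi/p)(1 + 2 \sum_{n=1}^{\infty} q^{n^\lambda} \cos 2\pi n x)$,

if $q = q(p)$ is defined by $-\log q = (2\pi/p)^\lambda$.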


4. What this proves is that the Fourier series of the function (12)
results if the series (3) is multiplied by the constant $2\pi/p$, where


(14) $p = 2\pi/(-\log q)^{1/\lambda}$.


Clearly, (14) defines a one-to-one mapping of the interval (2) on the half-line
$0 < p < \infty$, which is the parameter range admitted in (12).
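
The correspondence between q and p in (14) is easy to tabulate. The sketch below is an illustration only; `p_from_q` and `q_from_p` are hypothetical names for the map (14) and its inverse, which follows from solving $-\log q = (2\pi/p)^\lambda$ for q.

```python
import math


def p_from_q(q, lam):
    """Relation (14): p = 2*pi / (-log q)**(1/lam), for 0 < q < 1."""
    return 2.0 * math.pi / (-math.log(q)) ** (1.0 / lam)


def q_from_p(p, lam):
    """Inverse of (14): -log q = (2*pi/p)**lam, for p > 0."""
    return math.exp(-((2.0 * math.pi / p) ** lam))
```

With these helpers the coefficient $q^{n^\lambda}$ of (3) coincides with $\exp(-(2\pi n/p)^\lambda)$, which is the form in which it arises from the Fourier integral.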

In order to make applicable the expansion (6), let (12), where
$f(-t) = f(t)$, be written in the form


(15) $f(px) + \sum_{n=1}^{\infty} \{f(pn + px) + f(pn - px)\}$,
and let x be restricted to the interval


(16) $0 < x < 1$.




Since (15) or (3) has the period 1, the replacement of (10) by (16) involves 
no loss (except that x =0 is now excluded). 

Since (16) implies that both $n + x$ and $n - x$, hence both $pn + px$ and
$pn - px$, are positive for $n = 1, 2, \cdots$, and since (6) holds under the
proviso (8), the expansion (6) can be inserted into (15) under the proviso
(16). Hence, the function (15) can be written in the form


(17) $f(px) + \sum_{n=1}^{\infty} \left( \sum_{m=0}^{\infty} \{ c_m/(pn + px)^{m\lambda + \lambda + 1} + c_m/(pn - px)^{m\lambda + \lambda + 1} \} \right)$


if x is on the interval (16). Finally, it is readily seen from (7), (1) and
(6) that (17) can be rearranged into $2\pi/p$ times


0 n=0 1 


m= n= 


Accordingly, if $\lambda$, q are on their respective ranges (1), (2) and if
$p = p(q, \lambda) > 0$ is defined by (14), then the function (3) of period 1 can,
in terms of the constants (7), be represented in the form (18), where x and
$1 - x$, hence every $n + x$ and $n - x$, are positive and all exponents refer to
the real-valued (i.e., positive) determination of the respective multi-valued
functions.

It is clear from the order of magnitude of the coefficients (7) that the 
expansion (18) of (3) is a “series in singularities,’ having the character 
announced in the Introduction. 


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] A. Wintner, "The singularities of Cauchy's distributions," Duke Mathematical
Journal, vol. 8 (1941), pp. 678-681.
[2] A. Wintner, "On the shape of the angular case of Cauchy's distribution curves," Annals
of Mathematical Statistics, vol. 18 (1947), pp. 589-593.




ON THE SPECTRA OF CERTAIN BOUNDARY VALUE PROBLEMS.* 


By C. R. Putnam. 


1. In the differential equation
(1) $y'' + (\lambda + q)y = 0$,


let $\lambda$ be a real parameter and let $q = q(x)$ be a real-valued, continuous
function on the half-line $0 \le x < \infty$. The differential equation (1) is in
the Grenzpunktfall, in the terminology of Weyl [4], p. 238, if for some $\lambda$
(and hence for all $\lambda$) the differential equation (1) possesses at least one
solution $y(x)$ not of class $(L^2)$, that is,


$\int_0^{\infty} y^2(x) \, dx = \infty$.


In this case, the equation (1) and a boundary condition
($2_\alpha$) $y(0) \cos\alpha + y'(0) \sin\alpha = 0$,


determine a boundary value problem for every fixed $\alpha$. It is known [4],
p. 251, that a value $\lambda$ is not in the spectrum of the boundary value problem
determined by (1) and ($2_\alpha$) if and only if for every continuous function
$f(x)$ on $0 \le x < \infty$ of class $(L^2)$, the inhomogeneous differential equation

$y'' + (\lambda + q)y = f$

possesses a unique solution of class $(L^2)$ satisfying the boundary condition
($2_\alpha$). This characterization of the spectrum, together with an adaptation of
the Lebesgue-Toeplitz construction, has been applied by Wintner [6], pp. 26-27,
to the problem of locating points of the spectrum. It will be used below to
prove the following


THEOREM. Let $q = q(x)$ be a real-valued, continuous function on the
half-line $0 \le x < \infty$ belonging to class $(L^2)$, that is,


(3) $\int_0^{\infty} q^2(x) \, dx < \infty$.


Then the differential equation (1) is in the Grenzpunktfall and the half-line 


* Received May 8, 1948. 




$\lambda \ge 0$ belongs to the spectrum of the boundary value problem determined by
the differential equation (1) and an arbitrary boundary condition ($2_\alpha$).


If assumption (3) is replaced by 


(3 bis) $\int_0^{\infty} |q(x)| \, dx < \infty$,


then the statement above is implied by the theorem in [3], pp. 833-834,
which asserts that, in this case, the half-line $\lambda > 0$ is in the continuous
spectrum and contains no points of the point-spectrum. It remains undecided 
whether or not the corresponding statement concerning the continuous 
spectrum is valid under the milder restriction (3). That the corresponding 
statement concerning the point-spectrum is false is seen from the above 
Theorem and the example constructed in [5], pp. 394-395. 


2. Proof of theorem. The theorem in [1], § 1, and the remarks
following it loc. cit., § 2, imply that (1) is in the Grenzpunktfall in virtue of
the assumption (3).

It is therefore sufficient to prove that every positive value $\lambda$ is a cluster
point of the spectrum of the boundary value problem determined by (1) and
some fixed boundary condition ($2_\alpha$). For, according to [4], p. 251, the set
of points of the spectrum consisting of the continuous spectrum and the set
of cluster points of the point spectrum is independent of the boundary
condition ($2_\alpha$). Suppose, if possible, that a given $\lambda > 0$ is not a cluster
point of the spectrum for a boundary value problem. Since (1) is in the
Grenzpunktfall, there exist two distinct boundary conditions ($2_{\alpha_1}$) and ($2_{\alpha_2}$)
such that $\lambda$ is not an eigenvalue for either of the two boundary value problems
determined by (1) and the boundary conditions ($2_{\alpha_1}$) and ($2_{\alpha_2}$), respectively.

Let $y = r_1(x)$ and $y = r_2(x)$ denote linearly independent solutions of
the differential equation


(4) $y'' + \lambda y = 0$


satisfying the boundary conditions ($2_{\alpha_1}$) and ($2_{\alpha_2}$), respectively. In virtue
of (3) and the fact that each function $r_k(x)$ is a linear combination of $\sin \lambda^{1/2} x$
and $\cos \lambda^{1/2} x$, it follows that the functions $q(x) r_k(x)$ are continuous and of
class $(L^2)$ for $k = 1, 2$. Clearly, the function $r_k(x)$ is a solution of the
inhomogeneous differential equation


($5_k$) $y'' + (\lambda + q)y = q r_k$, $(k = 1, 2)$,


satisfying the boundary condition ($2_{\alpha_k}$).




Since $\lambda$ is not in the cluster spectrum and since $\lambda$ is not an eigenvalue
for either of the two boundary value problems determined by (1) and ($2_{\alpha_k}$),
where $k = 1, 2$, it follows that $\lambda$ is not in the spectrum for either of these
boundary value problems. Hence, there exists a unique solution $y = y_k(x)$
of ($5_k$), of class $(L^2)$, satisfying the boundary condition ($2_{\alpha_k}$). Define the
functions $z_k(x)$ by


(6) $z_k(x) = y_k(x) - r_k(x)$, $(k = 1, 2)$.


Then, for $k = 1, 2$, $z_k(x)$ is a solution of (1). The functions $z_k(x)$ are
linearly independent. For suppose $c_1 z_1 + c_2 z_2 \equiv 0$ for some pair of constants
$c_1$ and $c_2$; it follows from (6) that $c_1 r_1 + c_2 r_2$ is of class $(L^2)$. Consequently,
$c_1 = c_2 = 0$, since the function $y \equiv 0$ is the only solution of (4) of class
$(L^2)$, and $r_1$ and $r_2$ are linearly independent.

Since every solution $y = y(x)$ of (1) is a linear combination of $z_1$, $z_2$,
so that


$y(x) = c_1 z_1 + c_2 z_2 = (c_1 y_1 + c_2 y_2) - (c_1 r_1 + c_2 r_2)$,


it is seen that no non-trivial solution $y(x)$ of (1) is of class $(L^2)$. This,
however, contradicts the theorem proved in [2], p. 376, according to which
every point $\lambda$ not a cluster point of the spectrum is an eigenvalue for some
boundary condition. This completes the proof of the theorem.


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] P. Hartman, "On differential equations with non-oscillatory eigenfunctions," Duke
Mathematical Journal, vol. 15 (1948), pp. 697-709.

[2] P. Hartman and A. Wintner, "An oscillation theorem for continuous spectra,"
Proceedings of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[3] S. Wallach, "On the location of spectra of differential equations," American Journal
of Mathematics, vol. 70 (1948), pp. 833-841.

[4] H. Weyl, "Ueber gewöhnliche Differentialgleichungen mit Singularitäten und
die zugehörigen Entwicklungen willkürlicher Funktionen," Mathematische
Annalen, vol. 68 (1910), pp. 220-269.

[5] A. Wintner, "The adiabatic linear oscillator," American Journal of Mathematics,
vol. 68 (1946), pp. 385-397.

[6] A. Wintner, "On the location of continuous spectra," American Journal of Mathematics,
vol. 70 (1948), pp. 22-30.

TWIN CONVERGENCE REGIONS FOR CONTINUED FRACTIONS
$b_0 + K(1/b_n)$, II.*


By W. J. THRON.


1. Introduction. In a recent paper [2],¹ which shall here be referred
to as TCR, the author studied twin convergence regions for continued
fractions of the form


(1.1) $b_0 + 1/b_1 + 1/b_2 + \cdots$.


Two regions² $B_f$ and $B_g$ are called twin convergence regions for continued
fractions (1.1) if the two conditions


(a) $\sum |b_n| = \infty$,
(b) $b_{2n} \in B_f$, $b_{2n+1} \in B_g$, $n \ge 0$,


insure convergence of fraction (1.1) which satisfies these conditions. Whenever
at least one of the regions $B_f$ or $B_g$ is bounded away from the origin
condition (a) is implied by condition (b).

Two regions $B_f$ and $B_g$ are called best twin convergence regions for
continued fractions (1.1) if they are twin convergence regions and if moreover
there do not exist twin convergence regions $B'_f$ and $B'_g$ such that
$B'_f \supset B_f$, $B'_g \supset B_g$, where at least one of the relations $B'_f = B_f$, $B'_g = B_g$
fails to hold.

None of the convergence regions derived in TCR are best regions. In 
Section 2 of this paper a large class of best twin convergence regions is 
determined. These regions are improvements of the regions obtained in 
Theorem 6.1 of TCR. 

In Section 3 two examples are given which show that an analogous 
improvement of Van Vleck’s Criterion and of Theorem 7.3 of TCR is not 
possible. 

We conclude this section by a remark on notation. Let B be a set and 
$f(z)$ a single-valued function defined for all $z \in B$, then $f(B)$ is understood


* Received December 1, 1947. Presented to the American Mathematical Society
November 28, 1947. This paper contains the results of a research project undertaken 
by the author under a contract with the Office of Naval Research. 

1 The numbers in brackets refer to the bibliography at the end of this article. 

2 The term region is here used for open sets as well as for closures of such sets. 




to be the set of all points $w = f(b)$ where $b \in B$. Thus, for example, $-1/B$
is the set of all numbers $-1/b$, $b \in B$.


2. A class of best twin convergence regions. Throughout this section
we shall be concerned with pairs of regions $B_f$ and $B_g$ defined as follows:


(2.1) $z = r e^{i\theta} \in B_f$ if $r \ge f(\theta)$, $0 < m_1 < f(\theta) < m_2$,
      $z = r e^{i\theta} \in B_g$ if $r \ge g(\theta)$, $0 < m_3 < g(\theta) < m_4$,
where $f(\theta)$ and $g(\theta)$ are of period $2\pi$.
It was proved in TCR that:


A necessary condition for $B_f$ and $B_g$ to be twin convergence regions for
continued fractions of the form (1.1) is that


$f(\theta) g(\pi - \theta) \ge 4$, for all $\theta$.
One is thus led to the consideration of regions satisfying the condition
(2.2) $f(\theta) g(\pi - \theta) = 4$.
It was further shown in TCR that:


If the regions $B_f$ and $B_g$ satisfy conditions (2.1) and (2.2) and if in
addition the complements of both regions are convex, then the values of all
terminating continued fractions of the form

$b_0 + 1/b_1 + \cdots + 1/b_n$

lie in the region $B_f/2$ provided $b_{2n} \in B_f$ and $b_{2n+1} \in B_g$ for all $n \ge 0$.


It is thus of interest to find all functions $f(\theta)$ and $g(\theta)$ which satisfy
the assumptions of this theorem. This is done in the following lemma.


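LEMMA 2.1. Two regions $B_f$ and $B_g$ satisfy conditions (2.1) and (2.2)
and both have convex complements if and only if


$f(\theta) = f_0 \exp\left( \int_{\pi/2}^{\theta} \tan \alpha(\psi) \, d\psi \right)$,


$g(\theta) = (4/f_0) \exp\left( \int_{\pi/2}^{\theta} \tan \alpha(\pi - \psi) \, d\psi \right)$,


where $\alpha(\theta)$ is periodic with period $2\pi$, is continuous and satisfies the two
conditions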




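$|\alpha(\theta)| < \tfrac{1}{2}\pi - \epsilon$,
$|\alpha(\theta) - \alpha(\phi)| / |\theta - \phi| \le 1$, $\theta \neq \phi$,
for all $\theta$ and all $\phi$.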


It is easily seen that, unless the curve defined by $r = f(\theta)$ has a unique
tangent line at every point, at least one of the two regions under consideration
does not have a convex complement. Thus $f'(\theta)$ exists for all $\theta$. One
also deduces from the convexity condition imposed on the regions that
| f’(0)/f(@)| m2/m,. These results insure that 


6 
#(8) —foexp ( 


if the integral is taken in the sense of Lebesgue [3, p. 368]. One now defines 


$\alpha(\theta) = \mathrm{Arc\,tan}\,(f'(\theta)/f(\theta))$.
It is then clear that
$|\alpha(\theta)| < \tfrac{1}{2}\pi - \epsilon$, for all $\theta$.


The function f(@) can therefore be written as 


6 
f(8) —fuexp ( f tan a(y)ay). 
Now let p= H($) be the equation of the tangent line to = curve r =f (6) 
at the point f(@)e. It is given by 
=cos «(6)f(@)/cos (6 —0+ «(6)), <6 < O—a47/2. 


The region bounded by and interior to the curve r = f(@) is therefore convex 
if and only if 


t($)/f(¢) 21 


for every 6 and every ¢ in the corresponding range. This condition can be 
written as 


cosa 
cos + a) 


0 
—exp ( [tan 6(y) —tan + a(8)) 


= sin (@(y) —y + 0—a(6)) 
% cos -cos (WY dy) = 1. 


= exp ( San a(w) dy — log cos (¢ — 6 + a) + log cos¢) 


The last inequality holds if and only if for every 6 




$(\alpha(\psi) - \alpha(\theta) + \theta - \psi)/(\theta - \psi) \ge 0$
holds for almost all $\psi \neq \theta$, $\theta - \alpha - \pi/2 < \psi < \theta - \alpha + \pi/2$. The above
inequality is equivalent to

$(\alpha(\theta) - \alpha(\psi))/(\theta - \psi) \le 1$.
By going through the analogous argument for the function $g(\theta) = 4/f(\pi - \theta)$
one obtains the additional condition


$(\alpha(\theta) - \alpha(\psi))/(\theta - \psi) \ge -1$.
The restrictions as to the range of $\psi$ for these conditions can be omitted,
since if these conditions are satisfied in "the small" they are certainly
satisfied in "the large."
One now constructs a sequence {6,} which is everywhere dense on the 
real axis and which is such that 


| %(An) —%(Om)| / | for all 
A new function 8(@) is then introduced by the definition 


B(O%) = a(n), 
B(8) —lim 8 


Oni> 


This function is continuous and is equal to «(6) for almost all 6. But then 


0 
log f(€) —log fom {tan B(y) dy. 
Finally since 8(6) is continuous it follows that 


B(@) = Arctan d(log f(@) — log fy) = Arctan (f’(6)/f (6) ) 
= a(0) 


for all $\theta$. This completes the proof of the necessity. The sufficiency is
verified without difficulty.
Since $\alpha(\theta)$ was found to be a continuous function all integrals appearing
in the remainder of this section may be considered as Riemann integrals.
We now restrict ourselves to the consideration of pairs of regions $B_f$
and $B_g$ which in addition to conditions (2.1) and (2.2) satisfy


f(0) =f, exp ( tan a(W)dw), where «(6) is continuous 


2.3 
aa, has period 2z and is such that 


| «(0)| << | 1—a, 




A pair of regions $B_f$ and $B_g$ which satisfies conditions (2.1), (2.2) and
(2.3) will be called a regular pair of regions. The corresponding pair of
functions $f(\theta)$ and $g(\theta) = 4/f(\pi - \theta)$ will be called a regular pair of
functions.


LEMMA 2.2. Let $f(\theta)$ and $g(\theta)$ be a regular pair of functions and let
functions $k_f(b, z)$ and $k_g(b, z)$ be defined as follows


w = k;(b,z) =b exp Cf “Can za(y) —tan a(y))dyp), 

w =ky(b,z) = exp ( 2a(r—wy) —tan a(r—y)) dy), 


then there exists a positive number uw and a region D, defined by ze D if 


| R(z)| <1/(1—7), where (a, 2), 
and 
| S(2)| <a 
such that: 
i) For every b the functions k; and k, are holomorphic functions of z 
for all z in D. 
(ii) The regions k;(B;,z) and ky(By, form a regular pair of regions 
for every z in D. 


(iii) For R(z) =0,zeD 
wek;(B;,z) if |w| = fo 
wetk,(By, z) if |w|=4/fo. 
(iv) z=1eD and k;(b,1) k,(b,1) 


Property (i) is an immediate consequence of a known theorem in the
theory of functions [1, p. 172].
Since the tangent of an imaginary number is imaginary one has for
$\Re(z) = 0$
$|k_f| = f_0 |b| / f(\arg b) \ge f_0$ for $b \in B_f$
and
$|k_g| = 4|b| / (f_0 \, g(\arg b)) \ge 4/f_0$ for $b \in B_g$,


which proves statement (iii). Property (iv) is immediately verified by
substitution.
It remains to show that (ii) holds. To that end we consider $k_f$ as a




function mapping the b-plane into the w-plane. The variable z plays the role 
of a parameter. The Jacobian of the mapping is 


J =0(|w|,argw)/0(| 6], argb) 


argb 
= (1+ (tan za(arg b))) exp ( f [#(tan za(y)) —tan «(p) 
Now 
(tan (a -+ iy)) = tanh y/(cos? + sin? tanh? y). 


Due to the restrictions imposed on «(6) cos? za(0) > h > 0 for all z satisfying 
| R(z)| <<1/(1— 7). One hence can find a > 0 such that for ze D(m), 
that is for | §(z)| < and | R(z)| < and for all b 40, J 0. 
The transformation k; has therefore a single valued and differentiable inverse 
for all of these points. The same is true for the transformation k, Hence 
the boundaries of the regions k;(B;,z) and kg(By,z) are the images of the 
boundaries of the regions By and By, respectively. For z real, ze D(p,) the 
boundaries of the image regions are given by the equations 


r= | ky(f()e,z)| and r= | 2)|, 


respectively. It is easily verified that these two functions form a regular 
pair. This remark completes the proof of (ii) for real z. 


Now consider the mapping 
We = kh; 2), 2 + Az), w, 0. 


For every z in D(,) there exists a 8 such that for | Az | <8 the mapping 
defined by this function differs by a given arbitrary small amount from the 
identity mapping with respect to displacement of points as well as with 
respect to change in directions. This is seen to be a consequence of the fact 
that for z in D(p,) the functions 


| ky 


/d|b|, /dargb, Oarghk;/0|b|, dargk;/dargb, 


are continuous functions of z and that the first and the fourth of these 
functions do not vanish for z in D(m). 

It follows that a » > 0, w S pw, can be found such that for all z satisfying 
the two conditions 


<a 
the regions $k_f(B_f, z)$ and $k_g(B_g, z)$ have both convex complements. One now
verifies easily that these regions also satisfy conditions (2.1) and (2.2) and
thus completes the proof of (ii) and the lemma.




Now let $B_f$ and $B_g$ be a regular pair of regions and let the elements of
the continued fraction (1.1) satisfy the conditions


$b_{2n} \in B_f$, $b_{2n+1} \in B_g$, $n \ge 0$.
Consider the continued fraction
(2.4) $k_f(b_0, z) + 1/k_g(b_1, z) + 1/k_f(b_2, z) + \cdots$.


The approximants of this continued fraction are holomorphic functions of z
for z in D. This is a consequence of Lemma 2.2, (i) and Lemma 2.2 of TCR.

For no value of z in D does any approximant of (2.4) take on a value
on the open line segment from 0 to $i f_0/2$. As was stated in the beginning
of this section the values of the approximants of (2.4) lie in the region
$k_f(B_f, z)/2$ for $z \in D$ (since $k_f(B_f, z)$ and $k_g(B_g, z)$ form a regular pair of
regions) and for no z in D does the corresponding value region contain a
point of the line segment in question.

It follows that for z in D the approximants of (2.4) form a normal
family of holomorphic functions.

It was shown by Pringsheim that the continued fraction (1.1) converges
if all its elements satisfy the condition $|b_n| \ge 2$. By a simple
transformation one finds that the two conditions


$|b_{2n}| \ge f_0$, $|b_{2n+1}| \ge 4/f_0$, $n \ge 0$,


insure convergence of the continued fraction (1.1). One now notes that
for $\Re(z) = 0$, $z \in D$ the elements of the continued fraction (2.4) satisfy
these conditions.

An application of the Stieltjes-Vitali Theorem then leads to the conclusion
that the continued fraction (2.4) converges for all z in D. But
$z = 1 \in D$ and hence the continued fraction (1.1) converges. This completes
the proof of the following theorem.


THEOREM 2.1. Let $\alpha(\theta)$ be a continuous function of period $2\pi$ which


satisfies the two conditions 
| | < 


| «(6) 64d, a>0, 


Let 


f(6) =f. exp a(w) dy, fo> 0. 




Then the regions $B_f$ and $B_g$ defined by:
$r e^{i\theta} \in B_f$ if $r \ge f(\theta)$,
$r e^{i\theta} \in B_g$ if $r \ge 4/f(\pi - \theta)$,
are best twin convergence regions for continued fractions (1.1).
The following results are corollaries of this theorem. 


COROLLARY 2.1. Let a and c be real numbers satisfying the relation
$a > c \ge 0$ and let $\gamma$ be an arbitrary real number. Then the continued
fraction (1.1) converges if for all $n \ge 0$


| Don — | = a 
and 
| Donss — (a? — | = 4a/(a? — c?). 


COROLLARY 2.2. Let c be an arbitrary real number. Then the continued
fraction (1.1) converges if for all $n \ge 0$


| bn = 


3. Two examples. Let $A_n/B_n$ be the n-th approximant of the continued
fraction (1.1) then


(3.1) $A_n/B_n = b_0 - \sum_{k=1}^{n} (-1)^k / (B_{k-1} B_k)$,


where $B_0 = 1$, $B_1 = b_1$ and for $n \ge 2$


$B_n = b_n B_{n-1} + B_{n-2}$.


For $n \ge 2$ one thus has
(3.2) $b_n = (B_n - B_{n-2}) / B_{n-1}$.

EXAMPLE 3.1. The continued fraction of the form (1.1) with 
by — b, 
bn = 2 sin 4r(1/(n + 3) + 1/(n-+ 2)) exp 1/(n + 3) (n+ 2)), 


n= 2, 
diverges. 


Let 
dn +3). 




Employing formula (3.2) one easily verifies that the continued fraction 
with B, =e’, n> 1, is the one defined in the example. The divergence 
of this continued fraction follows immediately from relation (3.1). One
now notes that for the elements $b_n$ of the example $\sum |b_n| = \infty$.

Van Vleck's convergence criterion states that the continued fraction
(1.1) converges if (a) $\sum |b_n| = \infty$ and if


(b) $|\arg b_n| < \tfrac{1}{2}\pi - \epsilon$, $\epsilon > 0$.


It follows from Example 3.1 that the $\epsilon$ cannot be omitted in Van Vleck's
criterion.

The following example shows that in Theorem 7.3 of TCR the quantity
$\delta$, used there, cannot be set equal to zero.


EXAMPLE 3.2. The continued fraction of the form (1.1) with bb) =1, 


Don-1 a= 2 1(4 
Don 1(2"sp), n = 
where 8 = 0, Sn = 2(Sn1+ 2°"), n= 1, diverges. 


With the help of formula (3.2) one shows that the continued fraction 
of the form (1.1) whose B, are defined by 


Bon = (i/2)*, 
Bona 
has as elements the numbers given in the example. Further 


| BonBon-1 | 3%8,2-" <4: 37% 
since 
n-1 
0 < Sn2-" 4i-n 4-* < 4/3. 
k=0 
Hence the continued fraction diverges. 


WASHINGTON UNIVERSITY. 


BIBLIOGRAPHY. 


1. L. Bieberbach, Lehrbuch der Funktionentheorie, 4th edition, vol. 1, Leipzig, 1934. 
2. W. J. Thron, "Twin convergence regions for continued fractions $b_0 + K(1/b_n)$,"
American Journal of Mathematics, vol. 66 (1944), pp. 428-438.

3. E. C. Titchmarsh, The Theory of Functions, 2nd edition, Oxford, 1939. 




STRUCTURE OF GENETIC ALGEBRAS.* 


By R. D. SCHAFER.


I. M. H. Etherington has studied the non-associative algebras which 
arise in the symbolism of genetics (references [5] through [10]). In these 
he defines a class of algebras called train algebras, and proves in [7] that this 
class includes algebras called special train algebras which are defined by their 
structure rather than by any type of recurrence equation. 

From the algebraic point of view the concept of train algebra appears 
to be too inclusive, in that an analysis of the structure of train algebras 
seems feasible only when the rank of the algebra is small. However, from the 
point of view of genetics the concept of special train algebra is certainly too 
narrow. For, although the gametic algebras for the fundamental types of 
symmetrical inheritance are special train algebras, the corresponding zygotic 
(copular, etc.) algebras are not necessarily special train algebras [7, p. 6,
footnote].

We introduce a concept of genetic algebra which is intermediate between 
(commutative) train algebra and special train algebra. The definition is 
more satisfactory than that of special train algebra on two counts: the struc- 
ture of the algebra is not postulated, and the duplicate of a genetic algebra 
is a genetic algebra. It follows from this latter fact that our genetic algebras 
include, not only the fundamental symmetrical gametic algebras, but also 
the zygotic (copular, etc.) algebras obtained from them by duplication. On 
the other hand, this new concept is restrictive enough for us to deduce a 
transparent structure theory for genetic algebras. 

It is only fair perhaps to caution the reader that our interest in these 
algebras is entirely in the algebraic formalism, and that we can give no 
indication beyond Etherington’s own remarks in [5] and [8] of their possible 
contribution to the study of genetics. Also we use the name “genetic
algebra” with some misgivings. Our results are applicable to the algebras
arising in genetics where inheritance is symmetrical in the sexes, and we
abbreviate “genetic algebra of symmetrical inheritance” to “genetic algebra.”


1. Preliminaries. The principal tool of our investigation of genetic
algebras is the transformation algebra [1, §2]. Let $\mathfrak{A}$ be a non-associative


* Received December 5, 1947. 




algebra of order $n$ over a field $\mathfrak{F}$. Then for a fixed element $x$ in $\mathfrak{A}$ the
correspondences

    $a \to ax = aR_x, \qquad a \to xa = aL_x$,    for all $a$ in $\mathfrak{A}$,

are linear transformations on $\mathfrak{A}$ called the right and left multiplications $R_x$
and $L_x$ respectively. If $\mathfrak{M}$ is a subset of the total matric algebra $(\mathfrak{F})_n$ of
all linear transformations on $\mathfrak{A}$, the enveloping algebra of $\mathfrak{M}$ is the algebra
of all polynomials in the transformations in $\mathfrak{M}$ with coefficients in $\mathfrak{F}$. The
enveloping algebra of the set which consists of the identity $I$ in $(\mathfrak{F})_n$,
together with the right and left multiplications of $\mathfrak{A}$, is the transformation
algebra $T(\mathfrak{A})$ of $\mathfrak{A}$. Clearly any $T$ in $T(\mathfrak{A})$ may be written in the form

(1)    $T = \alpha I + f(R_{x_1}, L_{x_1}, R_{x_2}, L_{x_2}, \dots)$,    $\alpha$ in $\mathfrak{F}$, $x_i$ in $\mathfrak{A}$.


If $\mathfrak{B}$ is any linear subspace of $\mathfrak{A}$, the enveloping algebra of the set of
right and left multiplications of $\mathfrak{A}$ which correspond to elements in $\mathfrak{B}$ is
denoted by $\mathfrak{B}^*$. That is, $T$ in $\mathfrak{B}^*$ has the form (1) with $\alpha = 0$, $x_i$ in $\mathfrak{B}$.
(It should be noted that the transformations in $\mathfrak{B}^*$ are linear transformations
on $\mathfrak{A}$, although for compactness the notation does not indicate this.)




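A homomorphism $H$ of an algebra $\mathfrak{A}$ over $\mathfrak{F}$ into an algebra $\mathfrak{C}$ over $\mathfrak{F}$
is a linear mapping of $\mathfrak{A}$ into $\mathfrak{C}$ such that

(2)    $(ax)H = aH \circ xH$    for all $a, x$ in $\mathfrak{A}$,

where $\circ$ denotes multiplication in $\mathfrak{C}$. The kernel of $H$ is the set $\mathfrak{B}$ of all $b$
in $\mathfrak{A}$ such that $bH = 0$; $\mathfrak{B}$ is an ideal of $\mathfrak{A}$. The homomorphism is onto $\mathfrak{C}$
in case, for any $c$ in $\mathfrak{C}$, there exists some $a$ in $\mathfrak{A}$ such that $c = aH$. The
relationships between homomorphisms, ideals, and difference algebras are well-known. In case the homomorphism in question is from $\mathfrak{A}$ into the base field
$\mathfrak{F}$, we use a functional notation:

(3)    $\omega: x \to \omega(x)$,    $x$ in $\mathfrak{A}$, $\omega(x)$ in $\mathfrak{F}$,

and (2) becomes

(4)    $\omega(ax) = \omega(a)\omega(x)$,    for all $a, x$ in $\mathfrak{A}$.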


We call an algebra $\mathfrak{A}$ nilpotent in case there exists an integer $t$ such
that every product of $t$ elements in $\mathfrak{A}$, no matter how associated, is zero.
This is what Albert has recently called strongly nilpotent [2, p. 528; 3, p. 549].
He has defined a nilpotent algebra in the following way: every sequence
$a_1, \dots, a_k$ of $k$ elements of $\mathfrak{A}$ defines a special product $a^{(k)}$ of order $k$ by
$a^{(1)} = a_1$ and either of the formulas $a^{(i)} = a^{(i-1)} a_i$ or $a^{(i)} = a_i a^{(i-1)}$ for $i > 1$; if all special
products of order $k$ are zero and some special product of order $k - 1$ is not
zero, Albert calls $\mathfrak{A}$ nilpotent of index $k$. Certainly a strongly nilpotent
algebra is nilpotent by these definitions. However, an observation of
Etherington [7, p. 2] shows the equivalence of the two notions: for if every
special product of order $k$ in $\mathfrak{A}$ is zero, then every product of $t = 2^{k-1}$
elements in $\mathfrak{A}$, no matter how associated, is zero. Thus the concept of a
strongly nilpotent algebra is redundant.

A necessary and sufficient condition that an ideal $\mathfrak{B}$ of a non-associative
algebra $\mathfrak{A}$ be nilpotent is that the associative algebra $\mathfrak{B}^*$ be nilpotent [2,
Lemma 5].

If a non-associative algebra $\mathfrak{A}$ is homomorphic to a semi-simple algebra
(direct sum of simple algebras), there is an ideal $\mathfrak{R}$ of $\mathfrak{A}$, called the radical
of $\mathfrak{A}$, such that $\mathfrak{A} - \mathfrak{R}$ is semi-simple and $\mathfrak{R}$ is contained in every ideal $\mathfrak{B}$
of $\mathfrak{A}$ such that $\mathfrak{A} - \mathfrak{B}$ is semi-simple. It is an immediate consequence of
[2, Theorem 6] that any nilpotent ideal of $\mathfrak{A}$ is contained in the radical of $\mathfrak{A}$.


2. Baric algebras. A non-associative algebra $\mathfrak{A}$ of order $n$ over a field
$\mathfrak{F}$ is called baric in case it has a non-trivial representation of degree one,
that is, in case there is a homomorphism (3) of $\mathfrak{A}$ into $\mathfrak{F}$ such that for some
$x_0$ in $\mathfrak{A}$ we have $\omega(x_0) \neq 0$. It follows that $\omega$ is a homomorphism of $\mathfrak{A}$
onto $\mathfrak{F}$ since, for any $\alpha$ in $\mathfrak{F}$, we have $\omega(\alpha x_0/\omega(x_0)) = \alpha$. We call $\omega(x)$ the
weight of $x$, and $\omega$ the weight function of $\mathfrak{A}$.

We denote the kernel of the homomorphism $\omega$ by $\mathfrak{N}$. Then a necessary
and sufficient condition that a non-associative algebra $\mathfrak{A}$ be a baric algebra
is that $\mathfrak{A}$ contain an ideal $\mathfrak{N}$ such that

(5)    $\mathfrak{A} - \mathfrak{N} \cong \mathfrak{F}$.

Thus any non-associative algebra $\mathfrak{N}$ of order $n - 1$ over $\mathfrak{F}$ gives rise to a
baric algebra $\mathfrak{A}$ of order $n$ over $\mathfrak{F}$ if we adjoin an element $u$ to $\mathfrak{N}$ in any
fashion such that the elements $u^2 - u$, $uz$, and $zu$ are in $\mathfrak{N}$ for all $z$ in $\mathfrak{N}$
(a trivial construction).


In a gametic algebra $\mathfrak{A}$ we take a basis $u_1, u_2, \dots, u_n$ denoting the
gametic types involved in some genetical situation of symmetrical inheritance
[5, §6]. If $\gamma_{ijk}$ is the probability that an arbitrary gamete produced by an
individual of zygotic type $u_i u_j$ ($= u_j u_i$) be of gametic type $u_k$, we have

(6)    $u_i u_j = \sum_k \gamma_{ijk} u_k$    $(i, j = 1, \dots, n)$

subject to the conditions

(7)    $\sum_k \gamma_{ijk} = 1$    $(i, j = 1, \dots, n)$.

Etherington points out that equations (6) and (7), not assuming commutativity, imply that $\mathfrak{A}$ is a baric algebra with weight function

(8)    $\omega: x = \sum_i \xi_i u_i \to \omega(x) = \sum_i \xi_i$,    $\xi_i$ in $\mathfrak{F}$.

The converse is also true: given any baric algebra $\mathfrak{A}$ there is a basis
$u_1, u_2, \dots, u_n$ of $\mathfrak{A}$ whose multiplication table (6) is subject to the conditions
(7). Also the defining homomorphism $\omega$ has the form (8). For let $v_2, \dots, v_n$
be a basis of the ideal $\mathfrak{N}$ of $\mathfrak{A}$. There exists an element $u$ in $\mathfrak{A}$, but not in $\mathfrak{N}$,
of weight 1, so that $\mathfrak{A}$ has basis $u, v_2, \dots, v_n$. Write $u_1 = u$, $u_i = u + v_i$
$(i = 2, \dots, n)$. Then

(9)    $\omega(u_i) = 1$    $(i = 1, \dots, n)$.

Moreover, $\mathfrak{A}$ has the basis $u_1, \dots, u_n$ satisfying (6) for some $\gamma_{ijk}$ in $\mathfrak{F}$.
Now (4) and (9) imply that $1 = \omega(u_i)\omega(u_j) = \omega(u_i u_j) = \sum_k \gamma_{ijk}\,\omega(u_k)$
$= \sum_k \gamma_{ijk}$, so that (7) holds. Also (8) follows from (9).
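The relation between conditions (7) and the weight function (8) can be checked numerically. The following Python sketch is only an illustration, not part of the original argument: it assumes the gametic table of simple mendelian inheritance, $u_1^2 = u_1$, $u_1 u_2 = \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2$, $u_2^2 = u_2$, and the names `GAMMA`, `multiply` and `weight` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# GAMMA[i][j][k] is the coefficient gamma_ijk of u_k in u_i u_j (assumed table).
GAMMA = [
    [[1, 0], [HALF, HALF]],
    [[HALF, HALF], [0, 1]],
]


def multiply(x, y):
    """Product of two coordinate vectors according to the table GAMMA."""
    n = len(GAMMA)
    return [
        sum(x[i] * y[j] * GAMMA[i][j][k] for i in range(n) for j in range(n))
        for k in range(n)
    ]


def weight(x):
    """Weight function of the form (8): the sum of the coordinates."""
    return sum(x)


# Conditions (7): every row of structure constants sums to 1.
rows_sum_to_one = all(sum(GAMMA[i][j]) == 1 for i in range(2) for j in range(2))

# Two arbitrary sample elements.
x = [Fraction(3), Fraction(-1, 4)]
y = [Fraction(2, 5), Fraction(7)]
weight_is_multiplicative = weight(multiply(x, y)) == weight(x) * weight(y)
```

Both flags evaluate to true for this table; the second is the homomorphism property (4) for the sample pair `x`, `y`.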

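If $\mathfrak{A}$ is a baric algebra, then $T(\mathfrak{A})$ is also a baric algebra. For if $\mathfrak{A}$
has weight function $\omega$, a weight function $\theta$ for $T(\mathfrak{A})$ is defined by

(10)    $\theta(T) = \omega(uT)$,

where $u$ is any element of weight 1 in $\mathfrak{A}$. That $\theta$ is well-defined by (10)
is clear since, if $T$ in $T(\mathfrak{A})$ is written in the form (1), we have equivalently

(11)    $\theta(T) = \alpha + f(\omega(x_1), \omega(x_1), \omega(x_2), \omega(x_2), \dots)$.

Then $\theta$ is linear by (10), a homomorphism by (11), and non-trivial since
$\theta(I) = 1$.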


3. Genetic algebras. Etherington has investigated the non-commutative 
aspects of some of the concepts he has introduced. However, since any algebra 
encountered in genetics may be taken to be commutative [8, p. 26], we shall 
assume the commutative law in all that follows. 

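In a commutative algebra $\mathfrak{A}$ we have $L_x = R_x$ for all $x$, so that we may
write $T$ in $T(\mathfrak{A})$ in the form

(12)    $T = \alpha I + f(R_{x_1}, R_{x_2}, \dots)$,    $\alpha$ in $\mathfrak{F}$, $x_i$ in $\mathfrak{A}$.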


The characteristic function $|\lambda I - T|$ of $T$ in (12) has coefficients which are
polynomials in $\alpha$ and the coordinates of the $x_i$, polynomials which depend
both on the function $f$ and on elements of $\mathfrak{F}$ which are independent of $T$
(that is, scalars completely determined by $\mathfrak{A}$).

We call a commutative baric algebra $\mathfrak{A}$ over $\mathfrak{F}$ with weight function $\omega$
a genetic algebra in case the coefficients of the characteristic function of $T$
in (12), insofar as they depend on the $x_i$, depend only on the weights $\omega(x_i)$.
That is, these coefficients are polynomials in $\alpha$ and the $\omega(x_i)$ having coefficients which involve certain elements of $\mathfrak{F}$ determined by $\mathfrak{A}$ in combinations
determined by $f$. (Note: the fact that for a given $T$ in $T(\mathfrak{A})$ the expression
(12) is not unique has no bearing on our definition.)

This definition is an extension of Etherington’s definition of train algebra
[5, §4]. Define the right powers of $x$ in $\mathfrak{A}$ by $x^1 = x$ and

(13)    $x^k = x^{k-1} x$    $(k = 2, 3, \dots)$.

(Since we assume $\mathfrak{A}$ commutative, right powers and similarly defined left
powers are equal.) Then a commutative baric algebra $\mathfrak{A}$ over $\mathfrak{F}$ with weight
function $\omega$ is called a train algebra in case the coefficients of the (right) rank
equation [4, §19], insofar as they depend on $x$, depend only on $\omega(x)$. That
is, there exist elements $\beta_1, \dots, \beta_{r-1}$ in $\mathfrak{F}$ such that

(14)    $x^r + \beta_1 \omega(x) x^{r-1} + \cdots + \beta_{r-1} [\omega(x)]^{r-1} x = 0$

for all $x$ in $\mathfrak{A}$, where $x^k$ is the right power (13).


THEOREM 1. A genetic algebra $\mathfrak{A}$ over $\mathfrak{F}$ is a train algebra.


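Let $T = R_x$ in (12), and write $\omega(x) = \xi$. Then, since the coefficients
of the characteristic function

    $\phi(\lambda) = |\lambda I - R_x|$

of $R_x$ are homogeneous polynomials in the coordinates of $x$, we have, by the
definition of a genetic algebra,

(15)    $\phi(\lambda) = \lambda^n + \gamma_1 \xi \lambda^{n-1} + \cdots + \gamma_n \xi^n$

for some $\gamma_1, \dots, \gamma_n$ in $\mathfrak{F}$. Now (15) factors in a finite extension $\mathfrak{K}$ of $\mathfrak{F}$ as

(16)    $\phi(\lambda) = (\lambda - \lambda_1 \xi)(\lambda - \lambda_2 \xi) \cdots (\lambda - \lambda_n \xi)$,    $\lambda_i$ in $\mathfrak{K}$.

If

(17)    $\psi(\lambda) = \lambda^r + \psi_1 \lambda^{r-1} + \cdots + \psi_{r-1} \lambda$

is the rank function of $\mathfrak{A}$, $\psi_i$ a homogeneous polynomial of degree $i$ in the
coordinates of $x$, then (17) divides $\lambda\phi(\lambda)$ [4, §19]. The $\lambda_i$ in (16) may
then be ordered so that (17) equals $\lambda(\lambda - \lambda_1 \xi) \cdots (\lambda - \lambda_{r-1} \xi)$, from which
it follows that

    $\psi_j = (-1)^j \bigl(\sum \lambda_{i_1} \cdots \lambda_{i_j}\bigr)\, \xi^j$    $(j = 1, \dots, r - 1)$.

The rank equation is (14) with

    $\beta_j = (-1)^j \sum \lambda_{i_1} \cdots \lambda_{i_j}$    $(j = 1, \dots, r - 1)$

in $\mathfrak{F}$; $\mathfrak{A}$ is a train algebra.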


In a non-associative algebra in which powers of a single element are
not necessarily associative, the concept of a nilpotent element may be defined
variously. Here we shall call an element $z$ nilpotent in case there exists an
integer $k$ for which the right power $z^k = 0$. In a train algebra $\mathfrak{A}$ the
kernel $\mathfrak{N}$ of the weight function $\omega$ then has an easy characterization: $\mathfrak{N}$ consists of the nilpotent elements of $\mathfrak{A}$. For $z^k = 0$ implies $\omega(z^k) = [\omega(z)]^k = 0$,
$\omega(z) = 0$; conversely $\omega(z) = 0$ implies $z^r = 0$ by (14). It follows from
Theorem 1 that the same characterization of $\mathfrak{N}$ holds for genetic algebras.

We construct an example of a train algebra which is not a genetic algebra
as follows: let $\mathfrak{F}$ have characteristic two.¹ Then the square of any element
$z$ in the commutative algebra $\mathfrak{N} = (v_1, v_2, v_3)$ with multiplication table

    $v_1 v_2 = v_3$, $v_2 v_3 = v_1$, $v_3 v_1 = v_2$, $v_i^2 = 0$    $(i = 1, 2, 3)$

is zero. However, $\mathfrak{N}$ is not a nilpotent algebra, since $\mathfrak{N} = \mathfrak{N}^2$ ($= \mathfrak{N}\mathfrak{N}$).
Let $\mathfrak{A}$ be the algebra obtained by adjoining a unity element 1 to $\mathfrak{N}$. Then $x$
in $\mathfrak{A}$ has the form $x = \xi 1 + z$, and $\mathfrak{A}$ is a train algebra since $x \to \xi$ is a weight
function for $\mathfrak{A}$ while $z^2 = (x - \xi 1)^2 = x^2 + \xi^2 1 = 0$ implies $x^3 + \xi^2 x = 0$.
Since $\mathfrak{N}$ is not nilpotent, it follows from Theorem 4 below that $\mathfrak{A}$ is not a
genetic algebra. This example also shows that a structure theory as elementary as that in §5 below is not possible for train algebras.
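The claims made for this example can be spot-checked by exhaustive computation. The Python sketch below is an illustration under a stated assumption: it works over the prime field GF(2) only, whereas the text allows any field of characteristic two. It uses the table $v_1 v_2 = v_3$, $v_2 v_3 = v_1$, $v_3 v_1 = v_2$, $v_i^2 = 0$, and all function names are hypothetical.

```python
from itertools import product

# Commutative table on the basis v1, v2, v3 (indices 0, 1, 2); squares are zero.
TABLE = {(0, 1): 2, (1, 2): 0, (2, 0): 1}


def basis_product(i, j):
    """Index k with v_i v_j = v_k, or None when the product is zero."""
    return TABLE.get((i, j), TABLE.get((j, i)))


def multiply(x, y):
    """Product of two coordinate triples over GF(2)."""
    out = [0, 0, 0]
    for i, j in product(range(3), repeat=2):
        k = basis_product(i, j)
        if k is not None:
            out[k] = (out[k] + x[i] * y[j]) % 2
    return tuple(out)


def multiply_with_unity(x, y):
    """Product after adjoining 1; an element is a pair (xi, z)."""
    (xi, z), (eta, w) = x, y
    zw = multiply(z, w)
    vec = tuple((xi * w[k] + eta * z[k] + zw[k]) % 2 for k in range(3))
    return ((xi * eta) % 2, vec)


def train_equation_holds(x):
    """Check x^3 + xi^2 x = 0 with right powers x^2 = x x, x^3 = x^2 x."""
    xi, z = x
    cube = multiply_with_unity(multiply_with_unity(x, x), x)
    scaled = ((xi ** 3) % 2, tuple((xi * xi * c) % 2 for c in z))
    total = ((cube[0] + scaled[0]) % 2,
             tuple((p + q) % 2 for p, q in zip(cube[1], scaled[1])))
    return total == (0, (0, 0, 0))


elements = list(product((0, 1), repeat=3))
all_squares_zero = all(multiply(z, z) == (0, 0, 0) for z in elements)
products = {multiply(x, y) for x in elements for y in elements}
square_contains_basis = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} <= products
train_equation_everywhere = all(
    train_equation_holds((xi, z)) for xi in (0, 1) for z in elements
)
```

The three flags correspond to the three statements of the text: every square in the kernel vanishes, the kernel equals its own square (so it is not nilpotent), and the train equation holds after a unity element is adjoined.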

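A commutative baric algebra $\mathfrak{A}$ with weight function $\omega$ is called a special
train algebra in case

(a) the kernel $\mathfrak{N}$ of $\omega$ is nilpotent, and
(b) the subalgebras $\mathfrak{N}^k$ of $\mathfrak{N}$ defined inductively by $\mathfrak{N}^1 = \mathfrak{N}$, $\mathfrak{N}^k = \mathfrak{N}^{k-1}\mathfrak{N}$
for $k = 2, 3, \dots$ are ideals of $\mathfrak{A}$.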


¹ If we knew an example, over a more or less arbitrary field, of a commutative
non-nilpotent algebra $\mathfrak{N}$, all of whose elements are nilpotent, we could give a more
satisfying example of a train algebra which is not a genetic algebra by adjoining 1 to
$\mathfrak{N}$. There are many examples in the literature of non-commutative algebras $\mathfrak{N}$ with
these properties but, although it seems possible that commutative examples exist,
we have not been able to construct one.




THEOREM 2. A special train algebra $\mathfrak{A}$ over $\mathfrak{F}$ is a genetic algebra.


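Etherington has shown in [7, §4] that over a finite extension $\mathfrak{K}$ of $\mathfrak{F}$
there exists a basis of $\mathfrak{A}$ together with scalars $\lambda_1 = 1, \lambda_2, \dots, \lambda_n$ in $\mathfrak{K}$,
such that the matrix of $R_x$ for $x$ in $\mathfrak{A}$ has the form

(18)    $R_x = \begin{pmatrix} \xi\lambda_1 & * & \cdots & * \\ 0 & \xi\lambda_2 & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \xi\lambda_n \end{pmatrix}$,    $\xi = \omega(x)$.

Then the characteristic function (15) of $R_x$ has the form (16) with $\lambda_i$ as
in (18). Hence

(19)    $(-1)^t \gamma_t = \sum \lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_t}$    $(t = 1, 2, \dots, n)$

and the $\gamma_t$ in $\mathfrak{F}$ are dependent, not on $x$, but only on the algebra $\mathfrak{A}$. From
(18) we obtain

(20)    $f(R_{x_1}, R_{x_2}, \dots) = \begin{pmatrix} f(\xi_1\lambda_1, \xi_2\lambda_1, \dots) & * & \cdots & * \\ 0 & f(\xi_1\lambda_2, \xi_2\lambda_2, \dots) & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & f(\xi_1\lambda_n, \xi_2\lambda_n, \dots) \end{pmatrix}$,

where $\xi_i = \omega(x_i)$. Then $T$ in (12) has characteristic equation

    $|\lambda I - T| = |(\lambda - \alpha) I - f(R_{x_1}, R_{x_2}, \dots)|$
    $= [(\lambda - \alpha) - \mu_1][(\lambda - \alpha) - \mu_2] \cdots [(\lambda - \alpha) - \mu_n] = 0$

where we have written

    $\mu_i = f(\xi_1\lambda_i, \xi_2\lambda_i, \dots)$    $(i = 1, \dots, n)$.

Then

    $|\lambda I - T| = (\lambda - \alpha)^n + \rho_1 (\lambda - \alpha)^{n-1} + \cdots + \rho_n$

where $(-1)^s \rho_s$ is the elementary symmetric function of degree $s$ in the $\mu_i$.
But then by (20) the $\rho_s$ are polynomials in $\xi_1, \xi_2, \dots$
with coefficients which are symmetric functions of the $\lambda_i$. These coefficients
are expressible in terms of the elementary symmetric functions (19) of the $\lambda_i$,
and therefore are not dependent on the $x_i$. Hence $\mathfrak{A}$ is a genetic algebra.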

The copular algebra of simple mendelian inheritance (§6 below) is an
example of a genetic algebra which is not a special train algebra.




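4. Duplicate of a commutative algebra. Let $\mathfrak{A}$ be a commutative
algebra of order $n$ over $\mathfrak{F}$, and let $u_1, \dots, u_n$ be a basis of $\mathfrak{A}$ with multiplication table (6). The duplicate $\mathfrak{A}'$ of $\mathfrak{A}$ is defined as the commutative
algebra of order $\tfrac{1}{2}n(n+1)$ over $\mathfrak{F}$ with basal elements $v_{ij}$ $(i \le j;\ i, j = 1, 2, \dots, n)$
satisfying

(21)    $v_{ij} v_{rs} = \sum_{k,t} \gamma_{ijk} \gamma_{rst} v_{kt}$    $(i \le j,\ r \le s;\ i, j, r, s, k, t = 1, \dots, n)$

where we identify $v_{tk} = v_{kt}$ for $t > k$. That is, elements of $\mathfrak{A}'$ behave like
quadratic forms in $\mathfrak{A}$. The definition of $\mathfrak{A}'$ is independent of the basis chosen
for $\mathfrak{A}$ since $\mathfrak{A} \cong \mathfrak{A}_1$ implies $\mathfrak{A}' \cong \mathfrak{A}_1'$ [9, Theorem IV].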

The process of duplication is important in genetics because we obtain
from any gametic algebra $\mathfrak{A}$ a corresponding zygotic algebra $\mathfrak{A}'$ whose basis
consists of the zygotic types $u_i u_j$ ($= u_j u_i$) obtained from the gametic types
$u_1, u_2, \dots, u_n$ in $\mathfrak{A}$. Multiplication in the zygotic algebra $\mathfrak{A}'$ is carried out
as though it were being performed in $\mathfrak{A}$ according to the multiplication table
(6). Writing $v_{ij}$ for $u_i u_j$ $(i \le j)$ we obtain (21), where the coefficient of
$v_{kt}$ is the probability that an individual of zygotic type $u_i u_j$ mating with one
of type $u_r u_s$ will produce an individual of zygotic type $u_k u_t$. If $\omega$ is the
weight function (8) of the gametic algebra $\mathfrak{A}$, then

    $\omega': x = \sum_{i \le j} \xi_{ij} v_{ij} \to \omega'(x) = \sum_{i \le j} \xi_{ij}$,    $\xi_{ij}$ in $\mathfrak{F}$,

is a weight function for the zygotic algebra $\mathfrak{A}'$; $\mathfrak{A}'$ is a baric algebra. Genetical
calculations involving the first filial generation may be performed in $\mathfrak{A}'$,
those involving the second filial generation in the copular algebra $\mathfrak{A}''$ (the
duplicate of $\mathfrak{A}'$), etc.

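We return to the general notion of a duplicate algebra as defined by (21).
There is a homomorphism $H$ of $\mathfrak{A}'$ into $\mathfrak{A}$ defined by

(22)    $H: v_{ij} \to v_{ij}H = u_i u_j$    $(i \le j;\ i, j = 1, \dots, n)$.

Actually $H$ is a homomorphism of $\mathfrak{A}'$ onto the ideal $\mathfrak{A}^2$ ($= \mathfrak{A}\mathfrak{A}$) of $\mathfrak{A}$, since
the $u_i u_j$ span $\mathfrak{A}^2$. We denote the kernel of $H$ by $\mathfrak{D}$. It is easy to see that

(23)    $\mathfrak{D}\mathfrak{A}' = 0$

[9, Theorem II (ii)]; that is, $\mathfrak{D}$ consists of absolute divisors of zero.
Let $a$ be an element of $\mathfrak{A}'$. We denote the corresponding right multiplication of $\mathfrak{A}'$ by $R^*_a$. Then any element $T_*$ of $T(\mathfrak{A}')$ has the form

(24)    $T_* = \alpha I_* + f(R^*_{a_1}, R^*_{a_2}, \dots)$,    $\alpha$ in $\mathfrak{F}$, $a_i$ in $\mathfrak{A}'$,

where $I_*$ is the identity on $\mathfrak{A}'$.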




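LEMMA. Let a commutative algebra $\mathfrak{A}$ of order $n$ over $\mathfrak{F}$ have duplicate
$\mathfrak{A}'$, and $T_*$ in $T(\mathfrak{A}')$ have the form (24). Then the characteristic function
of $T_*$ is

(25)    $|\lambda I_* - T_*| = (\lambda - \alpha)^{\frac{1}{2}n(n-1)}\, |\lambda I - T|$

where $T$ in $T(\mathfrak{A})$ has the form (12) with $x_i = a_i H$, and $H$ is the homomorphism (22) of $\mathfrak{A}'$ into $\mathfrak{A}$.

Let $m$ be the order over $\mathfrak{F}$ of the kernel $\mathfrak{D}$ of $H$. Then $\mathfrak{A}' = \mathfrak{D} + \mathfrak{P}$
for a linear subspace $\mathfrak{P}$ of $\mathfrak{A}'$ having order $p = \tfrac{1}{2}n(n+1) - m$ over $\mathfrak{F}$.
It follows from (23) that, corresponding to this way of writing $\mathfrak{A}'$, the right
multiplication $R^*_a$ has matrix

    $\begin{pmatrix} 0 & 0 \\ M_a & N_a \end{pmatrix}$

where $M_a$ and $N_a$ are $p \times m$ and $p \times p$ matrices respectively. Since
$\mathfrak{A}' - \mathfrak{D} \cong \mathfrak{A}^2$ under the natural correspondence determined by $H$, we have
$N_a = R^\circ_{aH}$, the right multiplication of $\mathfrak{A}^2$ corresponding to $aH$ in $\mathfrak{A}^2$. Then
$R^*_a$ has matrix

(26)    $R^*_a = \begin{pmatrix} 0 & 0 \\ M_a & R^\circ_x \end{pmatrix}$,    $x = aH$ in $\mathfrak{A}^2$.

Now $\mathfrak{A} = \mathfrak{A}^2 + \mathfrak{G}$ for a linear subspace $\mathfrak{G}$ of $\mathfrak{A}$ having order $n - p$ over $\mathfrak{F}$.
Corresponding to this way of writing $\mathfrak{A}$, the matrix of $R_x$ for $x$ in $\mathfrak{A}^2 \subseteq \mathfrak{A}$ is

(27)    $R_x = \begin{pmatrix} R^\circ_x & 0 \\ P_x & 0 \end{pmatrix}$

since $\mathfrak{A}^2$ is an ideal of $\mathfrak{A}$. It follows from (26) and (27) that

(28)    $f(R^*_{a_1}, R^*_{a_2}, \dots) = \begin{pmatrix} 0 & 0 \\ * & f(R^\circ_{x_1}, R^\circ_{x_2}, \dots) \end{pmatrix}$

and

(29)    $f(R_{x_1}, R_{x_2}, \dots) = \begin{pmatrix} f(R^\circ_{x_1}, R^\circ_{x_2}, \dots) & 0 \\ * & 0 \end{pmatrix}$.

Then, denoting by $I_p$ the ($p$-rowed) identity on $\mathfrak{A}^2$ and $\mathfrak{P}$, equations (28)
and (29) imply that $|\lambda I_* - T_*| = (\lambda - \alpha)^m\, |(\lambda - \alpha) I_p - f(R^\circ_{x_1}, R^\circ_{x_2}, \dots)|$
$= (\lambda - \alpha)^{m - n + p}\, |\lambda I - T|$, for $T_*$ in (24) and $T$ in (12) with $x_i = a_i H$.
Equation (25) follows immediately since $m - n + p = -n + \tfrac{1}{2}n(n+1)$
$= \tfrac{1}{2}n(n-1)$.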


We use this lemma in the proof of 




THEOREM 3. The duplicate $\mathfrak{A}'$ of a genetic algebra $\mathfrak{A}$ over $\mathfrak{F}$ is itself
a genetic algebra.


By definition $\mathfrak{A}'$ is commutative. If $\omega$ is the weight function of $\mathfrak{A}$, then
a weight function $\omega'$ of $\mathfrak{A}'$ is defined by

(30)    $\omega'(a) = \omega(aH)$,    $a$ in $\mathfrak{A}'$, $aH$ in $\mathfrak{A}$,

where $H$ is the homomorphism (22) of $\mathfrak{A}'$ into $\mathfrak{A}$. It follows from (25) and
the fact that $\mathfrak{A}$ is a genetic algebra that the characteristic function of $T_*$ in
(24) has coefficients which, insofar as they depend on the $a_i$, depend only
on the $\omega(x_i) = \omega(a_i H) = \omega'(a_i)$; $\mathfrak{A}'$ is a genetic algebra.

The advantage of the concept of genetic algebra over special train algebra
lies in Theorem 3. The most elementary algebra of genetics, the gametic
algebra of simple mendelian inheritance (§6 below), is a special train algebra
(therefore a genetic algebra by Theorem 2). Hence by Theorem 3 all algebras
obtained from it by duplication are also genetic algebras. However, the
copular algebra of simple mendelian inheritance, obtained by duplicating
the gametic algebra twice, is not a special train algebra.


5. Structure of genetic algebras. It follows from (5) that the radical
of any baric algebra is contained in the kernel $\mathfrak{N}$ of the weight function $\omega$.
We shall show that for a genetic algebra $\mathfrak{A}$ the radical is $\mathfrak{N}$, by showing that
$\mathfrak{N}$, which we already know consists of the nilpotent elements of $\mathfrak{A}$, is actually
nilpotent.


THEOREM 4. Let $\mathfrak{N}$ be the kernel of the weight function $\omega$ of a genetic
algebra $\mathfrak{A}$ over $\mathfrak{F}$. Then $\mathfrak{N}$ is the radical of $\mathfrak{A}$, and is nilpotent.


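Let $T$ in $T(\mathfrak{A})$ have the form (12), and write $\omega(x_i) = \xi_i$. Then the
characteristic function

    $|\lambda I - T| = \lambda^n + \gamma_1\lambda^{n-1} + \cdots + \gamma_n$

of $T$ has coefficients $\gamma_j$ which are polynomials in $\alpha, \xi_1, \xi_2, \dots$, with constant
terms $\gamma_{j0} = 0$. For let $\alpha = 0$ and $x_i = 0$ in (12); then
$T = 0$, $|\lambda I - T| = \lambda^n = \lambda^n + \gamma_{10}\lambda^{n-1} + \cdots + \gamma_{n0}$, or $\gamma_{j0} = 0$ for $j = 1, \dots, n$.
It follows that $\lambda^n$ is the characteristic function of any $T$ in $T(\mathfrak{A})$ which may
be written in the form (12) with

(31)    $\alpha = 0$,    $\omega(x_i) = 0$.

In this case $T^n = 0$, $T$ is nilpotent. Now let $T$ be in the enveloping algebra
$\mathfrak{N}^*$ of the right multiplications corresponding to elements of $\mathfrak{N}$. Then (31)
is satisfied, $T$ is nilpotent. Since $\mathfrak{N}^*$ is an associative algebra consisting of
nilpotent elements, $\mathfrak{N}^*$ is nilpotent. Thus $\mathfrak{N}$ is a nilpotent ideal of $\mathfrak{A}$, and
is contained in the radical $\mathfrak{R}$ of $\mathfrak{A}$. On the other hand, (5) implies that
$\mathfrak{N}$ contains $\mathfrak{R}$, or $\mathfrak{N} = \mathfrak{R}$.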

The classical elements of a structure theory for a linear algebra $\mathfrak{A}$ are

(i) the nature of the radical $\mathfrak{N}$, and

(ii) the nature of the simple components of the semi-simple algebra
$\mathfrak{A} - \mathfrak{N}$.

For genetic algebras, (i) is answered by Theorem 4. Question (ii) is trivial
by virtue of (5).

One may also ask whether or not the analogue of the so-called Wedderburn
Principal Theorem holds: does $\mathfrak{A}$ contain a subalgebra $\mathfrak{S} \cong \mathfrak{A} - \mathfrak{N}$, so that
$\mathfrak{A} = \mathfrak{S} + \mathfrak{N}$? It is easy to see from (5) that this question is equivalent to
the following one: does $\mathfrak{A}$ contain an idempotent element $e$? One may
readily construct an example of a genetic algebra without an idempotent,
so the answer in general is negative.

However, the existence of an idempotent is significant genetically, since
it represents a population in equilibrium for random mating [6, p. 138].
Etherington gives conditions for the existence of an idempotent in a commutative baric algebra [6, Theorem VI]. Clearly a genetic algebra (or even a
train algebra) contains an idempotent $e$ if and only if there is an associative
subalgebra $\mathfrak{C}$ of $\mathfrak{A}$ which is not contained in $\mathfrak{N}$. For $e$ generates such an
algebra $\mathfrak{C}$, while the converse follows from the fact that any non-nilpotent
associative algebra $\mathfrak{C}$ contains an idempotent.


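6. Simple mendelian inheritance. Let $\mathfrak{F}$ have characteristic not two.
The gametic algebra $\mathfrak{G}$ of simple mendelian inheritance is the commutative
algebra $\mathfrak{G} = (u_1, u_2)$ of order 2 over $\mathfrak{F}$ with gametic multiplication table

(32)    $u_1^2 = u_1$, $u_1 u_2 = \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2$, $u_2^2 = u_2$.

An easy change of basis gives $\mathfrak{G} = (u, z)$ with

(33)    $u^2 = u$, $uz = \tfrac{1}{2}z$, $z^2 = 0$.

Writing $x = \xi u + \eta z$, we have weight function $\omega: x \to \omega(x) = \xi$; then
$\mathfrak{N} = (z)$, $\mathfrak{N}^2 = 0$, $\mathfrak{G}$ is a special train algebra. The transformation algebra
$T(\mathfrak{G})$ has order 3 over $\mathfrak{F}$, and any element $T$ of $T(\mathfrak{G})$ may be written in the
form

(34)    $T = \alpha I + 2R_x$,    $\alpha$ in $\mathfrak{F}$, $x = \xi u + \eta z$ in $\mathfrak{G}$.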




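The characteristic function of $T$ in (34) is

(35)    $\lambda^2 - (2\alpha + 3\xi)\lambda + (\alpha^2 + 3\alpha\xi + 2\xi^2) = [\lambda - (\alpha + 2\xi)][\lambda - (\alpha + \xi)]$.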


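The zygotic algebra $\mathfrak{Z}$ of simple mendelian inheritance is obtained from
$\mathfrak{G}$ by duplication. For purposes of computation in genetics one would want
the multiplication table obtained by duplicating (32). However, the structure
of $\mathfrak{Z} = \mathfrak{G}'$ is seen more easily when the multiplication table (33) is duplicated.
Write $a = uu$, $b = uz$, $c = zz$. Then $\mathfrak{Z} = (a, b, c)$ with

(36)    $a^2 = a$, $ab = \tfrac{1}{2}b$, $b^2 = \tfrac{1}{4}c$, $ac = bc = c^2 = 0$.

Writing $x = \xi a + \eta b + \zeta c$, we have the weight function $\omega: x \to \omega(x) = \xi$.
The kernel of $\omega$ is $\mathfrak{N} = (b, c)$. Then $\mathfrak{N}^2 = (c)$, $\mathfrak{N}^3 = 0$, $\mathfrak{Z}$ is a special
train algebra. The transformation algebra $T(\mathfrak{Z})$ has order 6 over $\mathfrak{F}$ and
any element $T$ of $T(\mathfrak{Z})$ may be written in the form

(37)    $T = \alpha I + 2R_{x_1} + 4R_{x_2}R_{x_3}$,    $x_i = \xi_i a + \eta_i b + \zeta_i c$, $\omega(x_i) = \xi_i$.

The characteristic function of $T$ in (37) is

(38)    $[\lambda - \alpha][\lambda - (\alpha + 2\xi_1 + 4\xi_2\xi_3)][\lambda - (\alpha + \xi_1 + \xi_2\xi_3)]$.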
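The factorizations of the two characteristic functions can be spot-checked with exact rational arithmetic. The sketch below is an illustration resting on assumptions that should be read as such: it takes the tables $u^2 = u$, $uz = \tfrac{1}{2}z$, $z^2 = 0$ and $a^2 = a$, $ab = \tfrac{1}{2}b$, $b^2 = \tfrac{1}{4}c$, $ac = bc = c^2 = 0$, takes $T = \alpha I + 2R_x$ and $T = \alpha I + 2R_{x_1} + 4R_{x_2}R_{x_3}$ as the forms of (34) and (37), and uses arbitrary sample values. All names are hypothetical.

```python
from fractions import Fraction

HALF, QUARTER = Fraction(1, 2), Fraction(1, 4)
# Structure constants keyed by (i, j), i <= j: {k: coefficient of basis element k}.
GAMETIC = {(0, 0): {0: Fraction(1)}, (0, 1): {1: HALF}}        # basis (u, z)
ZYGOTIC = {(0, 0): {0: Fraction(1)}, (0, 1): {1: HALF},
           (1, 1): {2: QUARTER}}                                # basis (a, b, c)


def right_mult(table, n, x):
    """Matrix of R_x; row i holds the coordinates of (basis element i) * x."""
    rows = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k, coeff in table.get((min(i, j), max(i, j)), {}).items():
                rows[i][k] += x[j] * coeff
    return rows


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(n, alpha, terms):
    """alpha * I plus the sum of coeff * matrix over (coeff, matrix) terms."""
    out = [[alpha if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for coeff, m in terms:
        for i in range(n):
            for j in range(n):
                out[i][j] += coeff * m[i][j]
    return out


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def char_value(t, lam):
    """Value of the characteristic function |lam I - T| at lam."""
    n = len(t)
    return det([[(lam if i == j else 0) - t[i][j] for j in range(n)]
                for i in range(n)])


alpha = Fraction(3)

# Gametic case: T = alpha I + 2 R_x with x = xi u + eta z.
xi, eta = Fraction(5), Fraction(7)
t_gametic = combine(2, alpha, [(2, right_mult(GAMETIC, 2, [xi, eta]))])
roots_35 = [alpha + xi, alpha + 2 * xi]

# Zygotic case: T = alpha I + 2 R_{x1} + 4 R_{x2} R_{x3}.
x1, x2, x3 = ([Fraction(v) for v in xs] for xs in ([2, 1, 4], [3, -1, 2], [5, 6, -7]))
r1, r2, r3 = (right_mult(ZYGOTIC, 3, x) for x in (x1, x2, x3))
t_zygotic = combine(3, alpha, [(2, r1), (4, mat_mul(r2, r3))])
s = x2[0] * x3[0]
roots_38 = [alpha, alpha + 2 * x1[0] + 4 * s, alpha + x1[0] + s]
```

Evaluating `char_value` at each entry of `roots_35` and `roots_38` gives zero for these sample values, and the roots depend only on $\alpha$ and the weights $\xi$, $\xi_i$, as the definition of a genetic algebra requires.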


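Duplication of $\mathfrak{Z}$ gives the copular algebra $\mathfrak{C}$ of simple mendelian
inheritance. As before we omit the multiplication table (important from
the point of view of genetics, but not structurally) which is obtained by
duplicating (32) twice. Instead we write $v = aa$, $p_1 = ab$, $p_2 = bb$, $p_3 = ac$,
$p_4 = bc$, $p_5 = cc$, for $a, b, c$ in (36). Then $\mathfrak{C} = (v, p_1, \dots, p_5)$ with

(39)    $v^2 = v$, $vp_1 = \tfrac{1}{2}p_1$, $vp_2 = \tfrac{1}{4}p_3$, $p_1^2 = \tfrac{1}{4}p_2$, $p_1 p_2 = \tfrac{1}{8}p_4$, $p_2^2 = \tfrac{1}{16}p_5$,
        $vp_j = p_i p_j = 0$    $(i = 1, \dots, 5;\ j = 3, 4, 5)$.

Writing $x = \xi v + \sum \eta_i p_i$, we have $\omega(x) = \xi$. The kernel of $\omega$ is $\mathfrak{N} = (p_1,
\dots, p_5)$, and $\mathfrak{N}^2 = (p_2, p_4, p_5)$, $\mathfrak{N}^3 = (p_4, p_5)$, $\mathfrak{N}^4 = 0$. Now $\mathfrak{C}$ is not a
special train algebra since $\mathfrak{C}\mathfrak{N}^2$ contains $vp_2 = \tfrac{1}{4}p_3$, which is not in $\mathfrak{N}^2$; $\mathfrak{N}^2$ is
not an ideal of $\mathfrak{C}$. That $\mathfrak{C}$ is a genetic algebra is guaranteed by Theorem 3.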
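The duplication rule (21) is mechanical, so the zygotic and copular tables can be generated rather than worked by hand. The following Python sketch is a minimal illustration, assuming the gametic table $u^2 = u$, $uz = \tfrac{1}{2}z$, $z^2 = 0$ as input; the function names and the index conventions are hypothetical. In the copular algebra the generated basis order is $aa, ab, ac, bb, bc, cc$, that is $v, p_1, p_3, p_2, p_4, p_5$ at indices 0 to 5.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def duplicate(gamma, n):
    """Structure constants of the duplicate, following rule (21).

    gamma[(i, j)] with i <= j is a dict {k: coefficient of u_k in u_i u_j};
    missing keys mean a zero product. Returns (table, pairs), where pairs[m]
    is the index pair (i, j) of the m-th basal element v_ij.
    """
    pairs = list(combinations_with_replacement(range(n), 2))
    index = {pair: m for m, pair in enumerate(pairs)}
    table = {}
    for a, b in combinations_with_replacement(range(len(pairs)), 2):
        coeffs = {}
        for k, gk in gamma.get(pairs[a], {}).items():
            for t, gt in gamma.get(pairs[b], {}).items():
                m = index[tuple(sorted((k, t)))]  # identify v_tk with v_kt
                coeffs[m] = coeffs.get(m, Fraction(0)) + gk * gt
        table[(a, b)] = {m: c for m, c in coeffs.items() if c != 0}
    return table, pairs


def multiply(table, size, x, y):
    """Product of two coordinate vectors in the commutative algebra of table."""
    out = [Fraction(0)] * size
    for a in range(size):
        for b in range(size):
            for m, c in table[(min(a, b), max(a, b))].items():
                out[m] += x[a] * y[b] * c
    return out


# Gametic table in the basis (u, z): u^2 = u, uz = z/2, z^2 = 0.
gametic = {(0, 0): {0: Fraction(1)}, (0, 1): {1: Fraction(1, 2)}}
zygotic, zyg_pairs = duplicate(gametic, 2)   # basis order: a = uu, b = uz, c = zz
copular, cop_pairs = duplicate(zygotic, 3)   # basis order: aa, ab, ac, bb, bc, cc

# Two different products of four factors p1 = ab (index 1) in the copular algebra.
p1 = [Fraction(int(m == 1)) for m in range(6)]
square = multiply(copular, 6, p1, p1)
fourth_right_power = multiply(copular, 6, multiply(copular, 6, square, p1), p1)
square_times_square = multiply(copular, 6, square, square)
```

Under these assumptions `zygotic` reproduces $a^2 = a$, $ab = \tfrac{1}{2}b$, $b^2 = \tfrac{1}{4}c$ with all other products zero, and `copular` gives $vp_2 = \tfrac{1}{4}p_3$. The last two quantities differ, `fourth_right_power` being zero and `square_times_square` not, so powers of $p_1$ are not associative in the copular algebra.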


7. Jordan algebras. A commutative algebra $\mathfrak{A}$ of order $n$ over $\mathfrak{F}$ is
called a Jordan algebra in case

    $x^2(xy) = x(x^2 y)$    for all $x, y$ in $\mathfrak{A}$.

Let $\mathfrak{F}$ have characteristic not two; any linear subspace $\mathfrak{M}$ of $(\mathfrak{F})_m$ which
is closed with respect to “quasi-multiplication”

(40)    $xy = \tfrac{1}{2}(x \cdot y + y \cdot x)$,    $x, y$ in $\mathfrak{M}$,

where $x \cdot y$ denotes the associative multiplication of transformations in $\mathfrak{M}$,
is a Jordan algebra of linear transformations of order $n \le m^2$ over $\mathfrak{F}$
[3, p. 546].

The gametic algebra $\mathfrak{G}$ of simple mendelian inheritance is the case
$n = 2$ of the genetic algebra $\mathfrak{G}_n = (u, z_2, \dots, z_n)$ with multiplication table

(41)    $u^2 = u$, $uz_i = \tfrac{1}{2}z_i$, $z_i z_j = 0$    $(i, j = 2, \dots, n)$.

In [6, §6] Etherington has shown that any train algebra of rank 2 and order
$n$ over $\mathfrak{F}$ of characteristic not two is equivalent to $\mathfrak{G}_n$, and also has indicated
that these algebras are Jordan algebras [6, p. 138, footnote]. Actually they
are Jordan algebras of linear transformations. For let $e_{ij}$ $(i, j = 1, \dots, n)$
be the usual matric basis of $(\mathfrak{F})_n$ with matrix multiplication

(42)    $e_{ij} \cdot e_{kt} = \delta_{jk} e_{it}$    (Kronecker delta),

and let $\mathfrak{M} = (e_{11}, e_{12}, \dots, e_{1n})$. By (40) $\mathfrak{M}$ is a Jordan algebra of linear
transformations with multiplication $e_{1i} e_{1j} = \tfrac{1}{2}(\delta_{i1} e_{1j} + \delta_{j1} e_{1i})$, or

(43)    $e_{11}^2 = e_{11}$, $e_{11} e_{1j} = \tfrac{1}{2}e_{1j}$, $e_{1i} e_{1j} = 0$    $(i, j = 2, \dots, n)$.

By (41) and (43) the correspondence $u \to e_{11}$, $z_i \to e_{1i}$ $(i = 2, \dots, n)$ is an
equivalence between $\mathfrak{G}_n$ and $\mathfrak{M}$.

The zygotic algebra $\mathfrak{Z}$ of simple mendelian inheritance is also a Jordan
algebra of linear transformations. For let $\mathfrak{M}$ be the subspace of $(\mathfrak{F})_3$ with
basal elements $a = e_{22}$, $b = e_{12} + e_{23}$, $c = 4e_{13}$. Defining multiplication in
$\mathfrak{M}$ by (40) and (42), we obtain the multiplication table (36); $\mathfrak{M}$ is a Jordan
algebra of linear transformations and is equivalent to $\mathfrak{Z}$.
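This matrix realization can be verified directly. The Python sketch below is an illustration only: it assumes the basal elements $a = e_{22}$, $b = e_{12} + e_{23}$, $c = 4e_{13}$ (the factor 4 in $c$ is an assumption, chosen so that $b^2 = \tfrac{1}{4}c$), and it checks the zygotic table in the form $a^2 = a$, $ab = \tfrac{1}{2}b$, $b^2 = \tfrac{1}{4}c$, $ac = bc = c^2 = 0$ under the quasi-multiplication (40). The helper names are hypothetical.

```python
from fractions import Fraction

N = 3


def unit(i, j):
    """Matrix unit e_ij with 1-based indices, as a list of rows."""
    return [[Fraction(int(r == i - 1 and c == j - 1)) for c in range(N)]
            for r in range(N)]


def add(x, y):
    return [[x[r][c] + y[r][c] for c in range(N)] for r in range(N)]


def scale(k, x):
    return [[k * x[r][c] for c in range(N)] for r in range(N)]


def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]


def quasi(x, y):
    """Quasi-multiplication (40): one half of x.y + y.x."""
    return scale(Fraction(1, 2), add(matmul(x, y), matmul(y, x)))


ZERO = scale(0, unit(1, 1))
a = unit(2, 2)
b = add(unit(1, 2), unit(2, 3))
c = scale(4, unit(1, 3))  # assumed scaling, so that b*b equals c/4

table_holds = (
    quasi(a, a) == a
    and quasi(a, b) == scale(Fraction(1, 2), b)
    and quasi(b, b) == scale(Fraction(1, 4), c)
    and quasi(a, c) == ZERO
    and quasi(b, c) == ZERO
    and quasi(c, c) == ZERO
)
```

With these choices `table_holds` is true, so the span of `a`, `b`, `c` is closed under (40) and carries the zygotic multiplication table.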

Inasmuch as powers of a single element are associative in a Jordan
algebra over $\mathfrak{F}$ of characteristic not two [3, §5], the copular algebra $\mathfrak{C}$ of
simple mendelian inheritance is not a Jordan algebra. For we see from (39)
that the right power $p_1^4 = 0$, while $p_1^2 p_1^2 = \tfrac{1}{256}p_5 \neq 0$.

These considerations lead us to an analysis of those genetic algebras $\mathfrak{A}$
over a field $\mathfrak{F}$ of characteristic not two which are at the same time Jordan
algebras. Let $u$ be an element of weight 1 in $\mathfrak{A}$. Then, by the associativity
of powers in $\mathfrak{A}$, $u$ generates an associative subalgebra $\mathfrak{C}$ which is not contained
in $\mathfrak{N}$; there is an idempotent $e$ in $\mathfrak{A}$.

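Albert has shown in [3] that the only possible characteristic roots of $R_e$
are $0, \tfrac{1}{2}, 1$, so that the equation

(44)    $xe = \beta x$,    $\beta$ in $\mathfrak{F}$,

has solutions only for $\beta = 0, \tfrac{1}{2}, 1$. Writing $\mathfrak{A}_e(\beta)$ for the set of all $x$ in $\mathfrak{A}$
satisfying (44), one obtains $\mathfrak{A}$ as the supplementary sum

(45)    $\mathfrak{A} = \mathfrak{A}_e(1) + \mathfrak{A}_e(\tfrac{1}{2}) + \mathfrak{A}_e(0)$.

Let $t_\beta$ be the dimension of the space $\mathfrak{A}_e(\beta)$ over $\mathfrak{F}$. Then $t_1 = 1$ and
$t_1 + t_{1/2} + t_0 = n$. Corresponding to a basis of $\mathfrak{A}$ in the form (45), we have
the matrix of $R_e$ in the diagonal form

(46)    $R_e = \mathrm{diag}\{I_1, \tfrac{1}{2}I_{1/2}, 0\}$

where $I_\beta$ is the $t_\beta$-rowed identity matrix.
Since $T(\mathfrak{A})$ is a baric algebra with weight function $\theta$ defined by (10), or

(47)    $\theta(T) = \alpha + f(\xi_1, \xi_2, \dots)$,    $\xi_i = \omega(x_i)$,

for $T$ in (12), we know that one characteristic root of $T$ is $\theta(T)$. For
$\phi(\lambda) = |\lambda I - T|$ implies that $\phi(T) = 0$ and $\theta[\phi(T)] = \phi[\theta(T)] = 0$;
$\theta(T)$ is a root of $\phi(\lambda) = 0$. However, for Jordan algebras we can obtain the
following stronger result.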


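THEOREM 5. Let $\mathfrak{F}$ have characteristic not two, and $\mathfrak{A}$ be a genetic
algebra which is a Jordan algebra over $\mathfrak{F}$. Then $\mathfrak{A}$ contains an idempotent $e$,
and the distinct characteristic roots of $T$ in $T(\mathfrak{A})$ in (12) are at most three:

    $\alpha$,    $\theta(T)$,    $\alpha + f(\tfrac{1}{2}\xi_1, \tfrac{1}{2}\xi_2, \dots)$,

where $\xi_i = \omega(x_i)$, and $\theta$ is the weight function (10) of $T(\mathfrak{A})$. The multiplicities of these roots are the orders over $\mathfrak{F}$ of $\mathfrak{A}_e(0)$, $\mathfrak{A}_e(1)$, $\mathfrak{A}_e(\tfrac{1}{2})$
respectively.

For it follows from (46) that

(48)    $f(R_{\xi_1 e}, R_{\xi_2 e}, \dots) = \mathrm{diag}\{f(\xi_1, \xi_2, \dots) I_1,\ f(\tfrac{1}{2}\xi_1, \tfrac{1}{2}\xi_2, \dots) I_{1/2},\ 0\}$.

Then, since $\mathfrak{A}$ is a genetic algebra and $\omega(x_i) = \xi_i = \omega(\xi_i e)$, we have
$|\lambda I - T| = |(\lambda - \alpha) I - f(R_{x_1}, R_{x_2}, \dots)| = |(\lambda - \alpha) I - f(R_{\xi_1 e}, R_{\xi_2 e}, \dots)|$.
Then (48) and (47) imply that the characteristic function of $T$ is

(49)    $[\lambda - \alpha]^{t_0}\, [\lambda - \theta(T)]\, [\lambda - \{\alpha + f(\tfrac{1}{2}\xi_1, \tfrac{1}{2}\xi_2, \dots)\}]^{t_{1/2}}$.

The theorem follows.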


In the examples we have worked out for simple mendelian inheritance,
(35) and (38) illustrate the characteristic function (49).


THE INSTITUTE FOR ADVANCED STUDY. 


REFERENCES. 

1. A. A. Albert, “Non-associative algebras I. Fundamental concepts and isotopy,”
Annals of Mathematics, vol. 43 (1942), pp. 685-707.

2. ———, “On Jordan algebras of linear transformations,” Transactions of the
American Mathematical Society, vol. 59 (1946), pp. 524-555.

3. ———, “A structure theory for Jordan algebras,” Annals of Mathematics, vol. 48
(1947), pp. 546-567.

4. L. E. Dickson, Linear algebras, Cambridge Tract, no. 16 (1914, reprinted 1930).

5. I. M. H. Etherington, “Genetic algebras,” Proceedings of the Royal Society of
Edinburgh, vol. 59 (1939), pp. 242-258.

6. ———, “Commutative train algebras of ranks 2 and 3,” Journal of the London
Mathematical Society, vol. 15 (1940), pp. 136-149.

7. ———, “Special train algebras,” Quarterly Journal of Mathematics (Oxford), vol.
12 (1941), pp. 1-8.

8. ———, “Non-associative algebra and the symbolism of genetics,” Proceedings of
the Royal Society of Edinburgh, vol. 61, pt. B (1941), pp. 24-42.

9. ———, “Duplication of linear algebras,” Proceedings of the Edinburgh Mathematical Society (2), vol. 6 (1941), pp. 222-230.

10. ———, “Corrigendum: commutative train algebras of ranks 2 and 3,” Journal of
the London Mathematical Society, vol. 20 (1945), p. 238.




DIVISIBILITY PROPERTIES OF THE FOURIER COEFFICIENTS
OF THE MODULAR INVARIANT $j(\tau)$.*


By JOSEPH LEHNER.


1. Introduction. Some years ago D. H. Lehmer [1]¹ gave a series of
arithmetical properties of the Fourier coefficients of the modular invariant

    $j(\tau) = x^{-1} + 744 + 196{,}884x + \cdots$,    $x = \exp 2\pi i\tau$, $\Im(\tau) > 0$.

Here we derive some arithmetical properties of the coefficients of a quite
different character. In fact we prove

THEOREM 1. If

(1.1)    $j(\tau) = \sum_{\nu=-1}^{\infty} c_\nu x^\nu$,

then

(1.2)    $c_{5\nu} \equiv 0 \pmod{25}$
(1.3)    $c_{7\nu} \equiv 0 \pmod{7}$
(1.4)    $c_{11\nu} \equiv 0 \pmod{11}$,    $\nu = 1, 2, 3, \dots$


The general idea behind the proof of this theorem is as follows. Consider
the congruence (1.2) as an example. By applying a certain linear operator,
denoted by $U_5$, to the right member of (1.1) one obtains the series with
$\nu$ replaced by $5\nu$. The same operation performed on the left member yields
a modular function belonging to a subgroup of the full modular group. We
shall try, therefore, to express that modular function in terms of the basic
function of the subgroup, which is known to have integral coefficient Fourier
expansions. The identity which results is of the form

(1.4)′    $U_5 j = 25(a_1\Phi(\tau) + a_2\Phi^2(\tau) + \cdots + a_{25}\Phi^{25}(\tau))$,

where $\Phi$ is the basic function, and the congruence (1.2) follows at once.
Formula (1.4)′ is an algebraic relation connecting two modular functions on the same subgroup, $U_5 j$ and $\Phi$; i.e., it is a modular equation.

Modular equations have been used by Rademacher [3], Watson [4], and the 


* Received February 4, 1948. 
¹ Numbers in square brackets refer to the bibliography at the end of the paper.




author [2] to discuss the famous congruences of Ramanujan concerning the 
partition function. Ramanujan himself proposed certain identities from 
which his congruences would follow as immediate consequences in the same 
way as (1.2) follows from (1.4)’. Later these identities were proved using 
the methods of modular function theory. In particular, the proof of Theorem 
1 as outlined above is modeled after Rademacher’s proof of the Ramanujan 
congruences [3]. 

In the work on the partition congruences an essential inelegance occurs 
because of the fact that the partition function is not quite the coefficient 
function of the Fourier expansion of any modular function. The theory of 
modular functions must therefore be distorted somewhat so that an application 
to the partition problem is possible. The resulting awkwardness is justified, 
of course, by the importance of the problem. The author feels, however, 
that from the standpoint of the theory of modular functions, the developments 
of this paper place the theory of modular identities in a simple and natural 
setting. 

The congruences (1.2)-(1.4) can be extended to powers of their respective primes. In §§4-6 we prove

THEOREM 2. With the $c_\nu$ of (1.1), we have

(1.5)    $c_\nu \equiv 0 \pmod{5^{\alpha+1}}$    if $\nu \equiv 0 \pmod{5^\alpha}$
(1.6)    $c_\nu \equiv 0 \pmod{7^\alpha}$    if $\nu \equiv 0 \pmod{7^\alpha}$
(1.7)    $c_\nu \equiv 0 \pmod{11^2}$    if $\nu \equiv 0 \pmod{11^2}$,

for $\nu = 1, 2, 3, \dots$, and $\alpha = 1, 2, 3, \dots$

The difficulty which prevents ready extension of (1.7) to $\alpha > 2$ will appear
in due course (§4).

Note that no statement concerning $c_0$ is made in Theorems 1 and 2.
Obviously any $j(\tau) + \text{const.}$ has the same modular properties as $j(\tau)$ itself,
so that $c_0$ is quite arbitrary.


2. Congruences for the moduli 5 and 7. Let us recall first that an
entire modular function on the group $\Gamma$ is an analytic function $f(\tau)$, which is

(a) regular in the upper half-plane $\Im(\tau) > 0$,

(b) meromorphic with respect to an appropriate uniformizing variable
when $\tau$ assumes real rational values or when $\tau = i\infty$,

(c) invariant under $\Gamma$, i.e. $f(V\tau) = f(\tau)$ for every modular substitution $V\tau = (a\tau + b)/(c\tau + d)$ belonging to $\Gamma$.




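We assume $\Gamma$ to be a subgroup of finite index in the full modular group
$\Gamma(1)$. The subgroup of interest to us is $\Gamma_0(p)$, defined by $c \equiv 0 \pmod{p}$,
$p$ being a prime $> 3$. $j(\tau)$, of course, belongs to the full group.

In order to get a hold on the coefficients of $j(\tau)$ whose indices are
divisible by $p$ ($p = 5, 7, 11$ in the cases of interest) we introduce the operator [2]

(2.1)    $U_p f(\tau) = \frac{1}{p} \sum_{\lambda=0}^{p-1} f\!\left(\frac{\tau + \lambda}{p}\right)$.

First we note ([2], p. 499) that if $f(\tau) = \sum_{\nu=s}^{\infty} a_\nu x^\nu$, then

(2.2)    $U_p f = \sum_{\nu=t}^{\infty} a_{p\nu} x^\nu$,    $t = -[-s/p]$,

where, throughout this paper, $x = \exp 2\pi i\tau$. For the left member is

    $\frac{1}{p} \sum_{\lambda=0}^{p-1} \sum_{\nu=s}^{\infty} a_\nu \exp\{2\pi i(\tau + \lambda)\nu/p\} = \frac{1}{p} \sum_{\nu=s}^{\infty} a_\nu \exp(2\pi i\nu\tau/p) \sum_{\lambda=0}^{p-1} \exp(2\pi i\lambda\nu/p)$
    $= \sum_{\nu \equiv 0\ (p)} a_\nu \exp(2\pi i\nu\tau/p) = \sum_{\mu=t}^{\infty} a_{p\mu} \exp(2\pi i\mu\tau)$.

Thus in order to prove a property of the $a_\nu$ of $f$ for which $p \mid \nu$ we may
prove the same property for all of the $a_\nu$ of $U_p f$.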
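On the level of coefficients, (2.2) says that $U_p$ keeps the terms whose index is divisible by $p$ and divides the index by $p$. The Python sketch below illustrates this on a truncated expansion of $j(\tau)$. The coefficient values are the standard published ones, quoted here as an assumption and not taken from this paper, and the names `J_COEFFS` and `u_operator` are hypothetical.

```python
# Coefficients c_nu of j(tau) for nu = -1, ..., 11 (standard published values,
# quoted as an assumption; the series is truncated after x^11).
J_COEFFS = {
    -1: 1,
    0: 744,
    1: 196884,
    2: 21493760,
    3: 864299970,
    4: 20245856256,
    5: 333202640600,
    6: 4252023300096,
    7: 44656994071935,
    8: 401490886656000,
    9: 3176440229784420,
    10: 22567393309593600,
    11: 146211911499519294,
}


def u_operator(coeffs, p):
    """Apply (2.2): keep a_nu with p | nu and store it as the coefficient of x^(nu/p)."""
    return {nu // p: a for nu, a in coeffs.items() if nu % p == 0}


u5 = u_operator(J_COEFFS, 5)    # {0: c_0, 1: c_5, 2: c_10}
u7 = u_operator(J_COEFFS, 7)    # {0: c_0, 1: c_7}
u11 = u_operator(J_COEFFS, 11)  # {0: c_0, 1: c_11}

# Congruences (1.2)-(1.4) for the positive indices available in the truncation.
congruence_5 = all(a % 25 == 0 for nu, a in u5.items() if nu >= 1)
congruence_7 = all(a % 7 == 0 for nu, a in u7.items() if nu >= 1)
congruence_11 = all(a % 11 == 0 for nu, a in u11.items() if nu >= 1)
```

Within this truncation the three flags are true, in agreement with (1.2), (1.3) and (1.4); the constant term $c_0$ is excluded, since no statement is made about it.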


Next we prove a 

LEMMA. $U_pj$ is a modular function on $\Gamma_0(p)$.

This lemma may be established by the same reasoning employed in the
proof of Theorem 4 of [2] (pp. 499-501). That $U_pj$ is regular in the upper
half-plane is obvious. The subgroup $\Gamma_0(p)$ has only two parabolic vertices
when $p$ is prime, say, $\tau = i\infty$, $\tau = 0$. From (2.2) we note that at $\tau = i\infty$
we have the expansion

(2.3)  $U_pj(\tau) = c_0 + c_p x + c_{2p} x^2 + \cdots$.

At $\tau' = 0$ we write $\tau' = -1/p\tau$ and consider $\tau \to i\infty$. From (8.81) of [2]
we then find

(2.3)'  $pU_pj(\tau') = pU_pj(p\tau) + j(-1/p^2\tau) - j(\tau)$.

Using $j(-1/\tau) = j(\tau)$ and (2.2) we obtain

(2.4)  $pU_pj(\tau') = x^{-p^2} - x^{-1} + \text{const.} + \cdots$.

Thus $U_pj$ is regular at $\tau = i\infty$ and has a pole of order $p^2$ (in
$x = \exp 2\pi i\tau$) at $\tau' = 0$.




To establish the invariance of $U_pj$ on $\Gamma_0(p)$ we make use of Lemma 1
of [2], p. 501. We have


p-1 
Upj (Vr) Tyr = +A)/p, VeTo(p), 
A=0 
Dj(WpTur), WueTo(p’), 


(Tur) = Upj(r). 


Since $U_pj$ is a modular function on $\Gamma_0(p)$ we can express it as a rational
function (in fact, a polynomial) in the basic functions of that group. When
$\Gamma_0(p)$ is of genus zero* there is a univalent function the powers of which
in linear combination generate all the entire modular functions of the group.
This situation obtains when $p$ = 5, 7, 13, and the univalent basic functions are


(2.5)  $\Phi(\tau) = \{\eta(p\tau)/\eta(\tau)\}^r$,


where $r(p-1) \equiv 0 \pmod{24}$ and $r$ is minimal positive. Here $\eta(\tau)$ is the
Dedekind function


(2.5)'  $\eta(\tau) = \exp(\pi i\tau/12) \prod_{n=1}^{\infty} (1 - x^n)$.

$\Phi$ is obviously regular at $\tau = i\infty$, and near $\tau' = 0$ we have the integral
coefficient expansion, [2], (8.83),

(2.6)  $p^{r/2}\Phi(\tau') = \Phi^{-1}(\tau) = x^{-1}(1 + O(x))$,  $\tau' = -1/p\tau$.


It follows in view of (2.4) that by proper choice of integers $C_1, C_2, \cdots, C_k$
we can make sure that the function


$pU_pj - \{C_1 p^{r/2}\Phi + C_2 p^{r}\Phi^2 + \cdots + C_k p^{rk/2}\Phi^k\}$,  $k = p^2$,


does not have a pole at $\tau' = 0$.


At $\tau = i\infty$, $U_pj$ and $\Phi$ are both regular. Hence, with a suitable constant
$C_0$, we conclude that the function


$\delta(\tau) = U_pj(\tau) - p^{-1}\{C_1 p^{r/2}\Phi(\tau) + \cdots + C_k p^{rk/2}\Phi^k(\tau)\} - C_0$


is a modular function on $\Gamma_0(p)$, which is regular in the interior of the funda-
mental region and at both parabolic vertices $(0, i\infty)$, and has a zero at
$\tau = i\infty$. By a well-known theorem, $\delta$ vanishes identically. Thus we have
established the identity


(2.7)  $U_pj(\tau) = C_0 + p^{-1}\{C_1 p^{r/2}\Phi(\tau) + \cdots + C_k p^{rk/2}\Phi^k(\tau)\}$,  $k = p^2$.


* By the genus of a group we mean the topological genus of the closed manifold
obtained by identifying the edges of its fundamental region which are congruent under
the substitutions of the group.


From (2.2) and the last equation we write


(2.8)  $\sum_{\mu=0}^{\infty} c_{p\mu} x^{\mu} = C_0 + p^{r/2-1} \sum_{\nu=1}^{\infty} b_\nu x^\nu$,


where the $b_\nu$ are integers.

Now for $p$ = 5, $r$ = 6; for $p$ = 7, $r$ = 4; for $p$ = 13, $r$ = 2. Hence,
when $p$ is 5, 7, the congruences (1.2), (1.3) follow trivially from (2.8).
When $p$ = 13, however, $r/2 - 1 = 0$, so no obvious congruence follows.
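The values of $r$ quoted here follow directly from the definition "$r(p-1) \equiv 0 \pmod{24}$, $r$ minimal positive". A small Python sketch (illustrative only; the function names are not from the paper) reproduces them together with the exponent $r/2 - 1$ that decides whether a congruence can be read off:

```python
def minimal_r(p):
    """Smallest positive r with r * (p - 1) divisible by 24."""
    r = 1
    while (r * (p - 1)) % 24 != 0:
        r += 1
    return r


def leading_exponent(p):
    """The exponent r/2 - 1 discussed in the text (r is even for p = 5, 7, 13)."""
    return minimal_r(p) // 2 - 1


# Table of (r, r/2 - 1) for the three genus-zero primes named in the text.
table = {p: (minimal_r(p), leading_exponent(p)) for p in (5, 7, 13)}
```

For $p = 5$ and $p = 7$ the exponent is positive (2 and 1 respectively), while for $p = 13$ it is 0, which is why no congruence is obtained in that case.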


3. Congruences for the modulus 11. When we turn to $p$ = 11, we
face a new situation, for $\Gamma_0(11)$ is no longer of genus zero. Therefore, there
cannot be a univalent function on this subgroup, but it is possible to define
a pair of independent 2-valent functions, which together constitute a rational
basis ([2], pp. 501-505). These functions have integral coefficient expansions,
of which the leading terms are given below. At $\tau = i\infty$, with $x = \exp 2\pi i\tau$,


A(r) 
C(r) 52? 


At $\tau' = 0$, we put $\tau' = -1/11\tau$ and have


A(r’) =A(r) = 
11°C (r’) = 27? + 


(3.1) 


(3.2) 


We wish to represent $11U_{11}j$ as a polynomial in $A$ and $C$. The procedure
to be employed is explained in [2], p. 505. At $\tau = i\infty$, $11U_{11}j$ is regular,
as we saw in (2.3), but $A$ has a pole. However, $C$ and $11^2AC$ are regular.
At $\tau' = 0$, $A$ has a simple pole, and $11^2C$ a double pole. Hence, we may
remove the principal part of $11U_{11}j$ at $\tau' = 0$ by means of a polynomial with
terms like $(11^2C)^m$, $A \cdot (11^2C)^m$, the former being used for poles of even
order, the latter for those of odd order greater than one. These terms will
remain regular at $\tau = i\infty$ because of the regularity of $C$ and $11^2AC$ there.
Moreover, such terms all have integral coefficient expansions. We have not
yet considered the term in $x^{-1}$ in the principal part of $11U_{11}j$ at $\tau' = 0$.
However, this term must drop out since otherwise we could construct an
entire modular function having but one simple pole on $\Gamma_0(11)$, a subgroup
of genus one. The above considerations lead to an identity of the form




(3.3) Uij(r) =D 11{D,C(r) + (7) + Do112C?(r) 
+ (1) 0"(r)}, 
where ++, Dt, are integers. The expression in braces, 


when expanded in powers of 2, clearly has integral coefficients. The con- 
gruence (1.4) follows from this identity. 


4. Congruence for the modulus $11^2$. It is convenient at this point
to indicate how the identity (3.3) can be extended to the modulus $11^2$.
This is accomplished by subjecting both members of the identity to the
operator $U_{11}$. On the left we obtain by (2.2)


U 417) (7) = (7) } = 
u=0 


In the right member of (3.3) every term contains the factor 11 to at 
least the second power except possibly D, in which we are not interested, and 
D,, D’;. Tf we now apply the operator U,,, which merely replaces each 
coefficient a, of a power series by di:n, and take congruences to the modulus 
11°, we evidently obtain 
(4. 1) (7) D 11D,U;,C(r) 11D’,01,A(r)C(r) (mod 11°) 

— Dy 11D,U,,C(r) 11D’,Uy,{A(r)C (7) 1} 
(mod 11?). 

It is proved in [2] ((9.1), p. 512 and Lemma 4, p. 504) that

$U_{11}C(\tau) \equiv 0 \pmod{11}$,
$A(\tau)C(\tau) - 1 \equiv 0 \pmod{11}$.
Hence


(4. 2) (r) = =D’ (mod 11’), 
0 


and the congruence (1.7) is proved.


It is clear that the foregoing treatment cannot be extended to higher 
powers of 11. What is needed is a theorem about U1,C*(r), Ui1A (7) C* (7), 
and the higher iterates U,,?,U1:°,- - +. We would like this theorem to say 
that U,,C* is a polynomial of the same form as the expression in braces in 
(3.3) but with an additional factor of 11. 




When $p$ = 5 or 7 it is possible to prove theorems of this general character,
as will be shown in the two following sections. The reason is this: when
$p$ = 5 and 7, $\Gamma_0(p)$ is of genus zero. There is a univalent function $\Phi$ on
$\Gamma_0(p)$. This function satisfies a modular equation, i.e., an algebraic relation
between $\Phi(\tau)$ and $\Phi(\tau/p)$. The iterates $U_p\Phi, U_p\Phi^2, \cdots$ can be calculated
by applying Newton's formula for the sums of powers of the roots of an
algebraic equation to the modular equation, a device first used by Watson
[4]. Since $U_pj, U_p^2j, \cdots$ are expressible as polynomials in $\Phi$, the applica-
tion of Newton's formula yields theorems of the type mentioned above. Thus
it is possible to iterate the identity (2.7) indefinitely to obtain identities
for $U_p^mj$, $m = 1, 2, 3, \cdots$. This will be carried out in the next two sections.
This approach unfortunately breaks down when $p$ = 11, due to the non-
existence of modular equations of the required type for the functions $C^k$
and $AC^k$.


5. Congruences with powers of 5 as moduli. Our next step will be to 
generalize the identity (2.7) so that it involves powers of p in the indices 
of the coefficients {cy}. This is accomplished by repeated applications of the 
operator Uy. We denote the iterate of Up by U,°, etc.: 


(5.1) = U9") (7) }, m = 1,2,3,° 
In the first place it is clear that 
oo 00 
U,7j = {27 + = Cpt", q=p’, 
=0 u= 
and in general, 
(5. 2) r—p". 
On the other hand, iteration of (2.7) gives 


(5.3) =Co + + Cope? ++ + ar}, 


In order that the identity (5.3) shall give rise to a new congruence to the 
modulus p?, it is necessary that U,®,- - -, p")/?U,®* should be divisible 
by p, at least. However, we also want expressions for U,®,- - -, which are 
in a form suitable for iteration of Uy», since we are interested in identities 
and congruences to the moduli p*, p*,- - -. 

The simplest way of meeting both requirements is to prove that the 
polynomial in brackets in (5.3) reproduces itself under the operation Up 


t 
( 
a 
R 
fe 
€ 
( 
k= p’. 
Ww 
(5 
(5 




but picks up a factor p*, A= 1. This would be true for the polynomial if it 
were true for each power of ® separately. That is, we are led to consider 
the statement 


(5.4) 
AZ=1, +, u— 
where +, Cu,1 are integers. 
This statement, however is not true. It breaks down even for 1/1 as 


we shall notice later (cf. (5.11), (6.11)). 
We therefore proceed as follows. Rewrite (2.7) in the form 


(5.5) Upj = Bo + + + +> Byp**), 


the B’s being integers. We shall prove that under application of U, the 
above polynomial does reproduce itself and at the same time picks up a 
factor p*: 

= pM + Bo + By,ip’®"}, 
(5.6) = + Ba +> + 


A221, 122, 1. 


For this purpose we make use of the modular equation connecting ®(r) 
and @(pr). These equations have been given in the cases p=5,7 by 
Rademacher [3], (11.5) and (12.5). 

At this point we separate the cases p==5 and p—7, and consider the 
former only in this section. Putting 


(5:7) Z = ®(r) = {n(5r)/y(7) W = 5°@(7/5), 
we have from [3], 

(5. 8) =0, 

where 


(—1) pj = 5° 


(5. 9) b, = 63 bs = 63.5° b; — 5, 


bo = 52.5 6.58 


(5.8) results from replacing X and Y in Rademacher’s equation by 




Z(r) = Y(7/5), W(r) = X-1(7/5). 
The conjugates in W of (5.8) are clearly 
(5. 10) Wy = 5°@((r +A)/5), 0, 1, 2,3, 4, 
since replacing + by r + 1 leaves Z unaltered, by (2.5), (2.5)’. Hence, for 
the sum of the conjugates, we have 
4 
(5.11) U;®(7) > 1/5) 
A=0 
4 5 
= Wy = = 5 
\=0 I=1 
In order to facilitate further computation, we introduce the symbols 


Q = + - + 
where the a’s, b’s, s, and ¢ are integers (s=1,¢>1). P will denote a 


polynomial of ‘this type, not necessarily the same one at each appearance; 


likewise for Q. 
Obviously the sum and product of two P’s is again a P; this is not 
necessarily true for two Q’s. Moreover, every P is a Q. 


In terms of this notation we can rewrite (5.6) as 


U;® 


(5. 12) 
= DQ, 


On the other hand, we have from (5. 9) 


(5. 12)’ pi = 5°Q; 
then from (5.11) 
(5. 13) Ub = 


Comparison of (5.12) and (5.13) shows that we must take \—1. The 
first part of (5.12) is therefore proved, but we must still show 


(5. 14) 510.6! —= 5Q, 
Let 


(5. 15) Wak — + A)/8), 




Then 
N=0 
so (5.14) is equivalent to 
(5. 16) = 


This relation can be proved by the aid of Newton's formula for the sums
of powers of the roots of an algebraic equation. With the notation of (5.8)
and (5.15), Newton's formula is


(5.17)  $S_k = \sum_{j=1}^{k} (-1)^{j+1} p_j S_{k-j}$,  $k = 1, 2, 3, \cdots$,


with the usual conventions
$p_6 = p_7 = \cdots = 0$,  $S_0 = k$.


Although (5.16) is required only for $l \ge 2$, we shall find it convenient
to establish it for all positive integral $l$. In view of the recursive nature of
(5.17) we shall evidently proceed by induction. For $l = 1$, (5.16) is con-
tained in (5.12)', since, from (5.17), $S_1 = p_1$.
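Newton's formula can be tried out on a small example. The Python sketch below is an illustration under stated assumptions, not the paper's computation: it takes the $p_j$ to be the elementary symmetric functions of the roots, uses the sign convention $S_k = \sum_{j=1}^{k} (-1)^{j+1} p_j S_{k-j}$ with $S_0 = k$ in the last term and $p_j = 0$ beyond the degree, and the function name is made up.

```python
def power_sums(elementary, k_max):
    """Return [S_1, ..., S_kmax] from the elementary symmetric functions.

    `elementary[j - 1]` is p_j; p_j is taken to be 0 for j beyond the degree.
    """
    degree = len(elementary)

    def p(j):
        return elementary[j - 1] if j <= degree else 0

    sums = {}
    for k in range(1, k_max + 1):
        total = 0
        for j in range(1, k + 1):
            # Convention of the formula: in the term j = k, S_0 stands for k.
            s_prev = k if j == k else sums[k - j]
            total += (-1) ** (j + 1) * p(j) * s_prev
        sums[k] = total
    return [sums[k] for k in range(1, k_max + 1)]


# Roots 1, 2, 3: p_1 = 6, p_2 = 11, p_3 = 6.
example = power_sums([6, 11, 6], 4)  # [6, 14, 36, 98]
```

The result agrees with the directly computed sums $1^k + 2^k + 3^k$ for $k = 1, \ldots, 4$. In the paper the same recursion is applied with the $p_j$ replaced by polynomials in $Z$, which is why a ring of polynomials closed under sums and products is needed for the induction.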

In the process of establishing the inductive step in (5.17), we shall 
have to multiply the polynomials p; and S;_; and add the products. This 
operation will be facilitated if these polynomials are elements of a ring. For 
this purpose we choose the P polynomials defined above. 


We assume in the induction that 


(5. 18) Be §2+2(), r= i, 
Since evidently 

ii, 
we have 
(5. 18)’ Si, = 5°"*1P, n= 1,2,---,d—1. 
Also, let 
(5. 19) Pn = 5%P, 

= 9 @,=14 
Xo = 12 


The «& are obtained from (5. 9). 




If ]1=5 we rewrite (5.17) as 
1-1 
(5. 20) Si = — (— 1) 
j=1 
Then applying (5. 18)’, (5.19) to (5.20), we get 
1-1 
1 


From (5.19) we see that a; = 2j + 2; hence, 


524+2+2 (1-j)+1p + 52l+2 Pp  §21+2p — 521420), 


1 
proving (5.16) when 1/5. 


For /= 5, substitute (5.18)’, (5.19) in (5.17) and obtain 


Si > 5 24+2+2 (1-j) +1P — 521+3P 
j=l 
This proves (5.16) for all ]=1 and thereby (5.14) (for 722). In 
turn this establishes (5.12) with A=1, which is equivalent to (5.16) 
(p=5,A=1). If we operate on (5.5) with U; and make use of (5.6) 
we are enabled to write the identity 


(5. 21) U;7)(r) = Ao 53(A,®(r) A,5°@* (7) A,5'@'(r)), 
where the A’s and ¢ are integers. 


An immediate consequence of (5.21) is the congruence (1.5) of Theorem 
2 for «= 2. 

We can now continue in the same way to apply the operator U; to both 
sides of (5.21). Using (5.6) (with 41) one readily establishes by 
induction the identities 


(5. 22) U3") (r) Ao,m Ay mB (7) Az,md°®? (7) 


+: At,md*! (7) }, m= 1, 2, 3,° 


where Aojm,Ai,m;***;Atm are integers. Equation (1.5) of Theorem * 


follows at once. 




6. Congruences with powers of 7 as moduli. The developments of the
preceding section can be extended without difficulty to the case $p$ = 7. We
shall therefore merely list the relevant relations using the same equation
number as in § 5, i.e., (5.13) → (6.13), etc. The proofs all go through in
the same way.


(6. 5) =By + + ++ + B, 
(6.7) Z—=O(r) =  W= 
(6. 8) = 0 
j=l 
(6. 9) = 
l=j 


(oe) 
| 
~2 


6 7 
(6 11) = 7° Wy) 7 
A=0 t=1 
(6 13) =70 
(6. 16) Si — ] 2, 3, 
k 
(6. 17) Sic = (— k=1,2,°--, 
j=1 
Ps== ==), So=k 
(6.18) Sn = 
(6. 19) Pa == 
3 Xs == 8 Az = 13 
= 5 As = 10 
aj = j 2 
S 4 yep 
1-1 


yj+2+1-j+1pP 4 W1+2P p 


— 7° 

b, = 176: 7? bs = 46-7 

b, = 845- = 4: 7" 
1 

‘i= 

1 




The identities which result from these considerations are 
(6. 22) (7) Ao,m 7™{A1m® (7) (7) 
+--+ ++ m = 1, 2,3,° 


the A’s being integers. This proves equation (1.6) of Theorem 2 and 
completes the proof of that theorem. 


ADDED IN PROOF: Equation (1.7) of Theorem 2 is valid for the modulus
$11^3$, in other words, the congruence

$c_\nu \equiv 0 \pmod{11^3}$ if $\nu \equiv 0 \pmod{11^3}$,  $\nu = 1, 2, 3, \cdots$,

is correct. The theorems required for the proof of this result are contained
in a paper by the author, "Proof of Ramanujan's Partition Congruence for
the Modulus $11^3$," submitted to the Bulletin of the American Mathematical
Society on Sept. 13, 1948. The required formulae are listed as equations
(A) and (B) near the end of § 2 of that paper.


NEW YORK CITY. 


BIBLIOGRAPHY. 


1. D. H. Lehmer, "Properties of the coefficients of the modular invariant $J(\tau)$,"
American Journal of Mathematics, vol. 64 (1942), pp. 488-502.
2. J. Lehner, "Ramanujan identities involving the partition function for the moduli
$11^\alpha$," American Journal of Mathematics, vol. 65 (1943), pp. 492-520.
3. H. Rademacher, "The Ramanujan identities under modular substitutions," Trans-
actions of the American Mathematical Society, vol. 51 (1942), pp. 609-636.
4. G. N. Watson, "Ramanujans Vermutung über Zerfällungsanzahlen," Journal für
die reine und angewandte Mathematik, vol. 179 (1938), pp. 97-128.



LIE AND JORDAN TRIPLE SYSTEMS.* 


By NATHAN JACOBSON. 


The present paper is devoted to a study of subspaces of an associative
algebra that are closed relative to the ternary operation [[a,b],c] where
[a,b] = ab - ba. Such systems, called Lie triple systems, arise in a natural
way in the study of Jordan algebras and of Jordan triple systems. The latter
are defined to be subspaces of an associative algebra that are closed relative to
{{a,b},c} where {a,b} = ab + ba. In the first part of this paper we con-
sider some general properties of such systems. The second half of our paper
is concerned with the study of certain particular Lie and Jordan triple
systems that have arisen in quantum mechanics. These systems have a basis
$g_1, g_2, \cdots, g_n$ and multiplication tables, respectively


$[[g_i, g_j], g_k] = \delta_{ik}g_j - \delta_{jk}g_i$,
$g_ig_jg_k + g_kg_jg_i = -\delta_{ij}g_k - \delta_{kj}g_i$.


The latter relations have been introduced by Duffin ¹ and by Kemmer ² in the
study of meson fields and there is an extensive literature on the representation
theory of such systems. In this paper we consider an extension of this theory.


I. Elementary Properties. 


1. Lie and Jordan triple systems. We recall that a subspace $\mathfrak{L}$ of an
associative algebra $\mathfrak{A}$ is a Lie algebra if $\mathfrak{L}$ is closed relative to the commutator
composition [a,b] = ab - ba. Similarly, a subspace is a special Jordan
algebra if it is closed relative to the Jordan product {a,b} = ab + ba. A
typical example of a Lie algebra is the set of skew symmetric matrices of n
rows and columns while an example of a Jordan algebra is the set of sym-
metric matrices. We now define a Lie triple system (L.t.s.) $\mathfrak{L}_3$ as a sub-
space of an associative algebra that is closed under the ternary composition
[[a,b],c]. Similarly, a subspace $\mathfrak{J}_3$ of an associative algebra will be called
a Jordan triple system (J.t.s.) if it is closed relative to {{a,b},c}. Of
course, any Lie (Jordan) algebra is a Lie (Jordan) triple system. On the


* Received February 22, 1948. 
¹ Duffin [6].
² Kemmer [10].




other hand, the set of skew symmetric matrices is an example of a J.t.s. that
is not a Jordan algebra.

We shall obtain next two characterizations of Jordan triple systems.
For this purpose we introduce the composition (abc) = abc + cba and we
note the following relations:


(1)  [[a,b],c] = (abc) - (bac)

(2)  {{a,b},c} = (abc) + (bac)

(3)  {{a,b},c} - {a,{b,c}} = [[c,a],b]

(4)  {{a,b},c} + {{b,c},a} - {{c,a},b} = 2(abc)

(5)  {{a,a},a} = 4a³

(6)  abc + acb + bac + bca + cab + cba = (abc) + (bac) + (acb).
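Relations (1)-(5) hold in any associative algebra, so they can be spot-checked with integer matrices. The following Python sketch is an illustration only (helper names are made up); it uses plain nested lists so that the arithmetic stays exact.

```python
import random


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def comm(a, b):          # [a, b] = ab - ba
    return sub(mul(a, b), mul(b, a))


def jord(a, b):          # {a, b} = ab + ba
    return add(mul(a, b), mul(b, a))


def tri(a, b, c):        # (abc) = abc + cba
    return add(mul(mul(a, b), c), mul(mul(c, b), a))


def check_relations(a, b, c):
    """Return {relation number: True/False} for relations (1)-(5)."""
    return {
        1: comm(comm(a, b), c) == sub(tri(a, b, c), tri(b, a, c)),
        2: jord(jord(a, b), c) == add(tri(a, b, c), tri(b, a, c)),
        3: sub(jord(jord(a, b), c), jord(a, jord(b, c))) == comm(comm(c, a), b),
        4: sub(add(jord(jord(a, b), c), jord(jord(b, c), a)),
               jord(jord(c, a), b)) == scale(2, tri(a, b, c)),
        5: jord(jord(a, a), a) == scale(4, mul(mul(a, a), a)),
    }


def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
a, b, c = (random_matrix(3, rng) for _ in range(3))
relations_ok = check_relations(a, b, c)
```

A random check of this kind is of course no proof, but each relation also follows at once by expanding both sides into monomials.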

THEOREM 1. A subspace $\mathfrak{J}_3$ of an associative algebra is a Jordan triple
system if and only if either one of the following conditions holds: (1) $\mathfrak{J}_3$ is
closed relative to (abc) = abc + cba, (2) $\mathfrak{J}_3$ is a Lie triple system closed
relative to cubes. It is assumed that the characteristic of the underlying
field is ≠ 2, 3.


Proof. If $\mathfrak{J}_3$ is a J.t.s., then by (4) $\mathfrak{J}_3$ is closed relative to (abc).
Conversely by (2) a subspace closed relative to (abc) is a J.t.s. Also (1)
and (5) show that a J.t.s. is a L.t.s. closed relative to cubes. Finally,
let $\mathfrak{J}_3$ be a L.t.s. closed relative to cubes. Then by (6)


(abc) + (bac) + (acb) ∈ $\mathfrak{J}_3$


and by (1), (abc) - (bac) and (abc) - (acb) ∈ $\mathfrak{J}_3$. Hence 3(abc) ∈ $\mathfrak{J}_3$ and
(abc) ∈ $\mathfrak{J}_3$. Thus $\mathfrak{J}_3$ is a Jordan triple system.


COROLLARY. Any special Jordan algebra is a Lie triple system.


A part of our result is that any J.t.s. is a L.t.s. Also it is clear from
(1) or from (3) that if two J.t.s. are isomorphic then they are also iso-
morphic as L.t.s. Hence we may speak of the Lie triple system of a J.t.s.
In particular, we can associate a uniquely determined L.t.s. with any special
Jordan algebra. Because of the relation (3) we shall also call this L.t.s.
the associator L.t.s. of the special Jordan algebra.




2. Examples. We consider now some important instances of triple
systems.


1) Meson systems. Let $\mathfrak{M}$ be the subspace of the algebra $\Phi_{n+1}$ of
(n+1) × (n+1) matrices over the field $\Phi$ spanned by the particular
matrices $g_i = e_{i,n+1} - e_{n+1,i}$, $i = 1, 2, \cdots, n$. $\mathfrak{M}$ is a J.t.s. since


(7)  $(g_ig_jg_k) = -\delta_{ij}g_k - \delta_{jk}g_i$.

For the L.t.s. of $\mathfrak{M}$ we have the multiplication table

(8)  $[[g_i, g_j], g_k] = \delta_{ik}g_j - \delta_{jk}g_i$.


The systems $\mathfrak{M}$ have been studied by Duffin, Kemmer and others in connection
with the mathematical theory of meson fields.³ For this reason we shall call
these systems meson triple systems. As we shall see, these systems are con-
nected with the Lie algebras $\mathfrak{S}_{n+1}$ of (n+1) × (n+1) skew symmetric
matrices. Systems that seem to be related in the same way with the Lie
algebras of matrices of trace 0 are given in the next example.
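The multiplication tables of the meson system can be confirmed by direct matrix computation. The Python sketch below is illustrative only; it assumes the realization $g_i = e_{i,n+1} - e_{n+1,i}$ by matrix units, and checks the Jordan triple product $(g_ig_jg_k) = g_ig_jg_k + g_kg_jg_i$ against $-\delta_{ij}g_k - \delta_{jk}g_i$ and the Lie triple product $[[g_i,g_j],g_k]$ against $\delta_{ik}g_j - \delta_{jk}g_i$. All helper names are made up.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def unit(i, j, size):
    """Matrix unit e_{ij} (1-based indices) of the given size."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(size)]
            for r in range(size)]


def g(i, n):
    """g_i = e_{i,n+1} - e_{n+1,i}, an (n+1) x (n+1) skew symmetric matrix."""
    return sub(unit(i, n + 1, n + 1), unit(n + 1, i, n + 1))


def delta(i, j):
    return 1 if i == j else 0


def check_meson_tables(n):
    """Check the Jordan and Lie triple tables for all index triples."""
    idx = range(1, n + 1)
    for i in idx:
        for j in idx:
            for k in idx:
                gi, gj, gk = g(i, n), g(j, n), g(k, n)
                jordan = add(mul(mul(gi, gj), gk), mul(mul(gk, gj), gi))
                commutator = sub(mul(gi, gj), mul(gj, gi))
                lie = sub(mul(commutator, gk), mul(gk, commutator))
                want_jordan = sub(scale(-delta(i, j), gk), scale(delta(j, k), gi))
                want_lie = sub(scale(delta(i, k), gj), scale(delta(j, k), gi))
                if jordan != want_jordan or lie != want_lie:
                    return False
    return True
```

Calling `check_meson_tables(n)` for small `n` should return `True`; the Lie table also follows from the Jordan table by relation (1) of § 1.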


2) Let & be the space spanned by the matrices xj = @i,nsi, Yi = Cns1,¢ 
Here we have 
(9) 


(YiXjYn) = + 


with all other triples (abc) =0. Thus & is a J.t.s. For the associated 
L. t.s. we have 
| [ yi], rx | 9 a. 8 542% 


10 
ay], yx] = + 


We consider next Lie triple systems that can be associated with arbitrary
Jordan algebras. An algebra has been called a Jordan algebra by Albert ⁴
if it is commutative and its multiplication satisfies


(11)  $(a^2b)a = a^2(ba)$.


Any special Jordan algebra is a Jordan algebra if the composition is taken
to be {a,b}. On the other hand, there exist Jordan algebras that cannot
be obtained as special Jordan algebras. Let $R_a$ denote the right multiplication
$x \to xa$ in a Jordan algebra $\mathfrak{J}$ and let $L_a$ denote the left multiplication
$x \to ax$. Then since $\mathfrak{J}$ is commutative $R_a = L_a$ and by (11) $R_{a^2}R_a = R_aR_{a^2}$.


³ See the bibliography.
⁴ Albert [2].




Also, as has been shown by Albert,⁵ the latter relation can be linearized
to give


(12)  $R_{(ab)c} = R_aR_{bc} + R_bR_{ac} + R_cR_{ab} - (R_aR_cR_b + R_bR_cR_a)$

if the characteristic is ≠ 2, 3. This implies

(13)  $R_{(ab)c - a(bc)} = [[R_c, R_a], R_b]$.


Thus if $R(\mathfrak{J})$ is the set of right multiplications operating in $\mathfrak{J}$ then $R(\mathfrak{J})$
is a subspace of the algebra of linear transformations in the vector space $\mathfrak{J}$
closed under the Lie triple product [[A,B],C]. Hence $R(\mathfrak{J})$ is a Lie
triple system. We shall call this system the multiplication L.t.s. of $\mathfrak{J}$.

If $\mathfrak{J}$ is a special Jordan algebra with an identity, the mapping $a \to R_a$
is 1-1 and linear of $\mathfrak{J}$ into $R(\mathfrak{J})$. Since


{{a,b},c} - {a,{b,c}} = [[c,a],b],


(13) implies that

$R_{[[c,a],b]} = [[R_c, R_a], R_b]$.


Hence $a \to R_a$ is an isomorphism of the associator L.t.s. of $\mathfrak{J}$ onto the
multiplication L.t.s.


3. Identities. We shall not attempt to obtain an axiomatic develop-
ment of Lie or Jordan triple systems in this paper. However, we shall
derive in this section a number of identities that may be useful for such a
development. We consider first relations for [[a,b],c] and these will be
clearer if we denote [[a,b],c] by [abc].

Skew symmetry of [a,b] and the Jacobi identity imply


(14)  [abc] = -[bac]
(15)  [abc] + [bca] + [cab] = 0.

The Jacobi identity

[[[a,b],[c,d]],e] + [[[c,d],e],[a,b]] + [[e,[a,b]],[c,d]] = 0

becomes by using


(16)  [[a,b],[c,d]] = [[[a,b],c],d] - [[[a,b],d],c],

[[[[a,b],c],d],e] + [[[[b,a],d],c],e] + [[b,a],[[c,d],e]] + [[c,d],[[a,b],e]] = 0

5 Albert [2], p. 299. Cf. also Jordan, Wigner and von Neumann [9]. An equation 
equivalent to (13) below is also given in Albert’s paper [2], p. 250. 


or

(17)  [[abc]de] + [[bad]ce] + [ba[cde]] + [cd[abe]] = 0.
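Identities (14), (15) and (17) can be spot-checked in a matrix algebra, writing [abc] for [[a,b],c]. The Python sketch below is an illustration only (helper names are made up); it tests the three identities on random integer matrices, where the arithmetic is exact.

```python
import random


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*ms):
    """Entrywise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]


def neg(a):
    return [[-x for x in row] for row in a]


def comm(a, b):
    return add(mul(a, b), neg(mul(b, a)))


def t(a, b, c):
    """Lie triple product [abc] = [[a, b], c]."""
    return comm(comm(a, b), c)


def is_zero(m):
    return all(x == 0 for row in m for x in row)


def check_lts_identities(a, b, c, d, e):
    return {
        14: t(a, b, c) == neg(t(b, a, c)),
        15: is_zero(add(t(a, b, c), t(b, c, a), t(c, a, b))),
        17: is_zero(add(t(t(a, b, c), d, e),
                        t(t(b, a, d), c, e),
                        t(b, a, t(c, d, e)),
                        t(c, d, t(a, b, e)))),
    }


def random_matrix(n, rng):
    return [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]


rng = random.Random(1)
mats = [random_matrix(3, rng) for _ in range(5)]
lts_ok = check_lts_identities(*mats)
```

Identity (17) is equivalent to the statement that $x \to [[a,b],x]$ acts as a derivation of the triple product, which is how it arises from the Jacobi identity above.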
Next we apply (16) and [[a,b],[c,d]] = -[[c,d],[a,b]] to derive


(18)  [[[a,b],c],d] + [[[b,a],d],c] + [[[d,c],b],a] + [[[c,d],a],b] = 0.


Taking commutators with e we obtain

(19)  [[abc]de] + [[bad]ce] + [[dcb]ae] + [[cda]be] = 0.

Also it follows directly from (16) that


[[[a,b],[c,d]],[e,f]] = [[[abc]de],f] + [[[bac]df],e]
        + [[[bad]ce],f] - [[[bad]cf],e].

Hence

[[[[a,b],[c,d]],[e,f]],g] = [[[abc]de]fg] + [[[bac]df]eg]
        + [[[bad]ce]fg] + [[[abd]cf]eg].


By Jacobi's identity for [[x,y],z] where x = [a,b], y = [c,d], z = [e,f],


(20)  [[[abc]de]fg] + [[[bac]df]eg] + [[[bad]ce]fg]
        + [[[abd]cf]eg] + Q + R = 0


where Q and R are obtained from the four displayed terms by cyclically
permuting the pairs (a,b), (c,d), (e,f).

We establish next an identity which shows that if $\mathfrak{J}$ is a special Jordan
algebra, then [a,b]² ∈ $\mathfrak{J}$ for every a, b ∈ $\mathfrak{J}$.⁶ It is well known that the
mapping $x \to [x,b]$ has the formal properties of differentiation. Hence


[a²,b] = a[a,b] + [a,b]a


and

[[a²,b],b] = 2[a,b]² + a[[a,b],b] + [[a,b],b]a
or
(21)  [a²bb] = 2[a,b]² + {a,[abb]}.


Since $\mathfrak{J}$ is a L.t.s., [abb] and [a²bb] ∈ $\mathfrak{J}$. Hence by (21), [a,b]² is in $\mathfrak{J}$.

We consider finally the function (abc) regarded as a function of a and ec. 
We set (abc) — {a,c}. Then this product satisfies the conditions for the 
product in a Jordan algebra, that is 


° This property is mentioned without proof in Jordan [8]. 




(22) {a, chy fc, ay 
(23) {{{a, cho, ayo = {{a, {c, 


These can be verified directly, or, we can obtain them from the following use-
ful observation. If $\mathfrak{J}_3$ is a Jordan triple system and b ∈ $\mathfrak{J}_3$ then the set $\mathfrak{J}_3b$ of
multiples ab, a ∈ $\mathfrak{J}_3$, is a special Jordan algebra. For (abc)b = {ab, cb} ∈ $\mathfrak{J}_3b$.
Similarly $b\mathfrak{J}_3$ is a special Jordan algebra.


II. Universal Algebras. 


4, Universal associative algebra of a system of equations." Let X bea 
set of non-commuting indeterminates xq and let §[1]| be the free polynomial 
algebra generated by the 2’s. A precise description of §[1] is the following. 
Let X be the free semigroup generated by the zg. Thus the elements of X 
are the monomials 2;,0;,° * -%,, vi,e X and two of these are regarded as 
equal only if they look alike. Finally multiplication is defined by 


We can now define %[X] to be the semi-group algebra over our field ® of 
the semi-group %. 

If § is an arbitrary subset of §[X] and 8 is the two-sided (®-) ideal 
generated by S, then we shall call the algebra 1 = 3[X]— 8% the universal 
associative algebra of the set of equations 


(24) P(X, T2,° tr) = 0, 


for pe S. For the sake of simplicity we shall also denote the coset x; + % 
by 2;. Then any element of U is a polynomial in these 2; and the relations 
(24) hold. Thus U can also be regarded as the algebra obtained from the 
free algebra by imposing the above relations. 

Suppose now that % is any associative algebra over ® that contains 
elements yq such that p(y:, y2,° * *,Yyr) =0 for every pe S. We shall call 
the subalgebra generated by the y’s the enveloping algebra € of the y’s. 
Evidently € is the totality of polynomials in the y’s with coefficients in ®. 
It is easy to see that the correspondence xg—>y_. determines a unique homo- 
morphism of U onto €. 

We apply these results now to Lie and Jordan triple systems. Let & 
be a L.t.s. with basis Y = {yg} and multiplication table 


7 An alternative discussion of universal algebras is given in Birkhoff and Whitman 
[3]. Cf. also Jacobson and Jacobson [7] for the case of special Jordan algebras. 




(25) Ll ya, ye], = 
We form the universal algebra U of the set of polynomials 
[[ 2a, xe], | — 


Then since the mapping 2v— Yq determines a homomorphism of U on the 
enveloping algebra & of the y’s, it is clear that the 2’s are linearly independent 
in Ut. Since [[@a, vg], ry] = Suapysvs in U, it is also clear that the space 
spanned by the 2’s is a Lie triple system isomorphic to %;. We shall now 
replace 2; by this subspace of U and we shall change our notation and denote 
the latter system as &;. 

Suppose now that a—>4 is a homomorphism of &, on &; a Lie triple 
system contained in the associative algebra 9. Then if Z, is the correspondent 
of ta, [[%a, Ze], fy] Hence, the correspondence xg can be 
extended to a homomorphism of MW onto the enveloping algebra & of the 4. 
Evidently this homomorphism maps a into a Also it is clear that & is the 
enveloping algebra of &,. Thus, we see that the homgmorphism of &, onto 
2, can be extended in one and only one way to a homomorphism of 1 onto 
the enveloping algebra & of &. 

The properties of 11 that we have noted are characteristic. For let WU 
be any associative algebra that contains a Lie triple system &, isomorphic 
to 2, under the correspondence a—>d. Suppose that 1) WU is the enveloping 
algebra of @, and, 2) any homomorphism é— 4 of @, ontoa L.t.s. & can 
be extended to a homomorphism of 1 into the associative algebra that defines 
2,. Then the correspondences a—>d& and @—>a can be extended to homo- 
morphisms of 11 onto UW and of 1 onto 1 respectively. It follows that both 
correspondences are isomorphisms. In this sense, UU is uniquely determined. 
In particular, we see that 1 is independent of the choice of basis. We shall 
call U the universal associative algebra of &. 

In a similar fashion we can define the universal associative algebra of a 
Jordan triple system. In fact, it is clear that our discussion is applicable 
to other algebraic systems that can be defined by using polynomial functions 
in an associative algebra. 


5. Universal Lie algebra of a Lie triple system. Let &; be a Lie triple 
system and let 9 be the subspace of the given associative algebra that is 
spanned by the vectors [a,b], a,b in &;. If we refer to the relation (16) 
and recall that [[a,b],c] and [[a, b], d] for a,b,c, d in &;, then we see 
that N is a Lie algebra. Also the subspace 8 = 2, + M is a Lie algebra since 




[a, ble if a and be and [[a, b],c] eX, if [a,b] eM and ceX,. Clearly 
© is the smallest Lie subalgebra containing 2;. We shall call & the enveloping 
Lie algebra of &;. It is also clear that & is finite dimensional if &; is finite 
dimensional. In fact, if y2,- +, Yn is a basis for then the elements 
yi, Lyis yi], i<j are generators for 2. Hence the dimensionality of & does 
not exceed n(n + 1)/2. 

We shall call 2 a universal Lie algebra for 2; if any homomorphism 
a—da of & onto a L.t.s. &; can be extended to a homomorphism of the 
enveloping Lie algebras. It is evident that if such an extension exists, then 
it is unique. Now it is easy to obtain a universal Lie algebra for any L. t.s. 
2,. For this purpose we assume that &, is contained in its universal asso- 
ciative algebra Ul. This can be arranged by replacing &, if necessary by an 
isomorphic copy of it. It is clear from the properties of the universal 
associative algebra U1 that the enveloping Lie algebra 2 of 2; in U is a 
universal Lie algebra. Also, it is clear that if a—4@ is an isomorphism of &; 
onto &, and if the enveloping Lie algebras 2 and & are universal, then the 
extended homomorphism is an isomorphism. In this sense the universal 
Lie algebra is uniquely determined. 

Suppose now that &, is finite dimensional and that & is its universal 
Lie algebra. Let a— 4 be a homomorphism of &; on @, and assume that the 
enveloping algebra 2 of &, has the same dimensionality as &. Then the 
extended homomorphism of 2 on & must be an isomorphism. Hence, in this 
case @ is universal for Q.. Thus, the universal Lie algebra for a finite 
dimensional L.t.s. can be characterized as an enveloping Lie algebra of 
maximum dimensionality. 

If 2 is the universal Lie algebra of 2, the subalgebra 9 of elements of 
the form [a,b], a, b in &; will be called the complementary Lie algebra of &. 
We proceed now to determine the universal Lie algebras and complementary 
Lie algebras for the examples listed in 2. 

For the meson Lie triple system we have the basis gi = @i,ns1 — nit, 
Hence [gi, 9;] = ej: —ei;. The elements gi, [gi,9;] are linearly inde- 
pendent and constitute a basis for the Lie algebra Sy,, of (n-+1) X (n +1) 
skew symmetric matrices. It follows that G,,, is universal. The comple- 
mentary Lie algebra has the basis e;; — eji, 1,7 j. Hence 
it is isomorphic to the Lie algebra Gp. 

We consider next the Lie triple system & with basis 7 = @é,ns1, Yi = Cnt, 
t—1,2,---,n. Here [2i,2;] =0 = [yi, y;] and yj] = — 
Thus the Lie algebra 2 spanned by the vectors [a, b], a, b in & has the basis 
and the enveloping Lie algebra & has dimensionality 2n +” 




=(n+1)?—-1. It is clear that 2 coincides with the Lie algebra ©’n,1,1 
of matrices of trace 0. Now let & be any L.t.s. that is isomorphic to & 
and let Zi, Yj be a basis corresponding to the basis 2;, y;. If we use the 
multiplication table (10) and the identity (18) for a=ai, b=9j, c= &, 
d=, we obtain 


| 8 jx[ Zi, — [ Zi, Ex | == (), 


If we set i=j =k and we obtain 3[%, —0. Hence Z:] 0. 
Similarly, [:, 9] 0. It follows that the dimensionality of the enveloping 
Lie algebra & of & does not exceed (n+1)2—1. Hence 2=@'n411 is 
universal for &. The complementary algebra has the basis ei; — 8¢j@ns1,n+15 
i,7=1,2,---,n. Hence it is isomorphic to the Lie algebra ®,; of all 


n X n matrices. 


6. Finiteness of dimensionality of the universal associative algebra 
for a Jordan triple system. Let %; be a Jordan triple system and suppose 
that $3; is contained in its universal algebra 11. We wish to show that if 3s 
is finite dimensional, then 11 is finite dimensional. Let 2, be a 
basis for 3. We shall call a monomial in the 2x; irreducible if it cannot be 
expressed as a linear combination of monomials of lower degree. Also we 
shall call two irreducible monomials equivalent if their difference can be 


expressed in terms of lower degree monomials. 


Lemma. An irreducible monomial is of degree <= max (2,n) im any 


of the aj. 


Let uw denote one of the 2’s and let the monomial have the form 

“Urs sure Tf a term appears it can be replaced by 
—wuxjx; + linear terms. This process can be used to replace the monomial 
by an equivalent one in which any uw appears two places to the left of its 
original position. Eventually we can get an equivalent monomial in which 
the w’s are separated by at most one 2;4u. Three w’s cannot occur con- 
secutively since w® can be expressed as a linear combination of 2’s. Also no 
terms of the form waju? or can occur, since = — + terms 
of lower degree and u?a;u ——aju® + terms of lower degree. Thus if wu? 
occurs, then no other w appears in the monomial. Finally we consider a 
product of the form (wa;)(uaj)- (war). We have seen that is a 
special Jordan algebra and it is clear that the vectors uaj, i=1,2,--°-,n 
are generators. Thus —=— (uax;) + terms of lower degree. 
Thus we can permute the factors uz;. But also (ua;)? is a sum of terms of 


158 : NATHAN JACOBSON. 


lower degree. It follows that no more than $n$ factors $u x_i$ can occur. This
concludes the proof.

Evidently this lemma implies that there are only a finite number of
irreducible monomials. Hence we have the following.


THEOREM. If $\mathfrak{J}$ is a Jordan triple system of finite dimensionality, then
its universal associative algebra $\mathfrak{U}$ is finite dimensional.


Of course, this implies that the enveloping associative algebra of any 
J.t.s. with a finite basis has a finite basis. 


III. Representation Theory for Meson Triple Systems. 


7. Svartholm's problem. With a view to applications to quantum
physics, N. Svartholm⁸ has proposed the problem of determining the structure
of the universal associative algebra $\mathfrak{U}$ of the system of equations

(26)  $[[x_i, x_j], x_k] = \delta_{ik} x_j - \delta_{jk} x_i$
(27)  $\phi(x_i) = 0$

where $i, j, k = 1, 2, \ldots, n$ and $\phi(\lambda)$ is a non-zero polynomial in $\Phi[\lambda]$. It
was shown by Svartholm that if $\phi(\lambda)$ is of degree 2, then $\phi(\lambda) = \lambda^2 + \tfrac{1}{4}$
and $\mathfrak{U}$ is an algebra of Clifford numbers.⁹ If $\phi(\lambda)$ is of degree 3, then
$\phi(\lambda) = \lambda^3 + \lambda$, so that $x_i^3 = -x_i$. By (26) if $i \ne j$
(28) — = — YH. 
Hence 


— = 0 and — + 0. 
Hence = 0 = 2;*2;2;. Multiplication by 2;, gives 
(29) = 0, tj. 
Thus (28) becomes 
(30) + = — 
Next let 1, 7,4 be 4. Then by (26) 


8 Svartholm [14].
9 Svartholm [14]. We assume in the sequel that our minimum polynomials have
no constant term. Using this convention the minimum polynomial in this case should
be regarded as $\lambda^3 + \tfrac{1}{4}\lambda$.




If we multiply on the left by zi? and use (29) and a;* = —a;, we obtain 
(31) — + = 0. 
Since 


(31) implies that 
Thus we have the equations 


for all $i, j, k$. Conversely, these equations imply (26) and $x_i^3 = -x_i$. Thus
we see that if $\phi(\lambda) = \lambda^3 + \lambda$ then $\mathfrak{U}$ can also be defined as the universal
associative algebra of the meson J.t.s. The structure of this algebra has
been determined by Svartholm using associative methods. In the remainder
of this paper we shall determine the structure of $\mathfrak{U}$ for arbitrary $\phi(\lambda)$. In
fact, we shall replace (27) by the weaker requirement

(34)  $\phi(x_1) = 0$.

Thus let $\mathfrak{U}$ be the universal algebra of (26) and (34). We assume that
$\Phi$ is an algebraically closed field of characteristic 0. Let $x_1 = x$, $x_2 = y$,
$[x_1, x_2] = z$. Then it follows from (26) that

(35)  $[x, y] = z, \quad [y, z] = x, \quad [z, x] = y$.

If we set $h_1 = \sqrt{-1}\,x$, $e = y + \sqrt{-1}\,z$, $f = y - \sqrt{-1}\,z$, then

(36)  $[h_1, e] = e, \quad [h_1, f] = -f, \quad [f, e] = 2h_1$.

Evidently if $\psi(\lambda) = \phi(-\sqrt{-1}\,\lambda)$ then $\psi(h_1) = 0$. Also by (36) if $g(\lambda)$
is any polynomial, then

$[g(h_1), e] = e\,\Delta g(h_1), \qquad \Delta g(\lambda) = g(\lambda + 1) - g(\lambda)$.


Iteration of this equation gives

$[\cdots[[g(h_1), e], e], \ldots, e] = e^r \Delta^r g(h_1)$  ($r$ factors $e$).

Now take $g(\lambda) = \psi(\lambda)$ and let $r$ be the degree of $\psi(\lambda)$. Then

$0 = e^r \Delta^r \psi(h_1)$.

Since $\Delta^r \psi(h_1) = r!$ this implies that $e^r = 0$. In a similar manner we can
prove that $f^r = 0$. We now note the following.


LEMMA. The universal algebra of any system of equations of the form
$[x_i, x_j] = \sum_k \gamma_{ijk} x_k$, $\phi_i(x_i) = 0$, $i, j, k = 1, 2, \ldots, n$ has a finite basis.


Proof. We define irreducible monomial and equivalence as in the preceding
section. Let $\phi_i(\lambda)$ be of degree $r_i$. Then we assert that any irreducible
monomial is of degree $< r_i$ in $x_i$. For if we use the relation $x_i x_j = x_j x_i
+ \sum_k \gamma_{ijk} x_k$ we can permute the $x$'s and replace our monomial by the equivalent
one $x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$. Also $x_i^{r_i}$ is a linear combination of the terms $x_i^j$,
$j < r_i$. Hence each $k_i < r_i$.

It follows from this lemma that the enveloping algebra of $h_1$, $e$, $f$ is finite
dimensional. Since this algebra includes $x_2$ and $[x_1, x_2]$, these elements
satisfy algebraic equations. In a similar manner we can prove that every $x_i$
and every $[x_i, x_j]$ is algebraic. Since the space determined by these elements
is a Lie algebra the enveloping algebra $\mathfrak{U}$ of these elements is finite
dimensional by the lemma. This proves the


THEOREM. The universal algebra of the set of equations (26) and (34)
is finite dimensional.


We can now conclude that $\mathfrak{U}$ has a 1-1 representation by finite matrices.
If $x_i \to X_i$ in a representation of $\mathfrak{U}$ then $x_i \to X_i$, $[x_i, x_j] \to [X_i, X_j]$
determines a representation of the Lie algebra $\mathfrak{S}_{n+1}$ of skew symmetric matrices.
For we know that the universal Lie algebra of $\mathfrak{M}$ is $\mathfrak{S}_{n+1}$. Also we have the
relation $\phi(X_1) = 0$ in any representation of $\mathfrak{U}$. Conversely, if $x_i \to X_i$ in a
representation of $\mathfrak{S}_{n+1}$ and $\phi(X_1) = 0$ then the $X_i$ satisfy (26) and (34) and
therefore they define a representation of $\mathfrak{U}$. Now it is well known that any
representation of $\mathfrak{S}_{n+1}$ is completely reducible. Hence we see that there
exists a completely reducible 1-1 representation of $\mathfrak{U}$. This implies that
$\mathfrak{U}$ is semi-simple and since $\Phi$ is algebraically closed, $\mathfrak{U}$ is a direct sum of
complete matrix algebras. Also we know that the simple matrix components
of $\mathfrak{U}$ are in 1-1 correspondence with the different similarity classes of
irreducible representations of $\mathfrak{U}$. Now it is clear that a representation of $\mathfrak{U}$
is irreducible if and only if the associated representation of $\mathfrak{S}_{n+1}$ is irreducible.
Thus we have reduced the problem of determining the structure of $\mathfrak{U}$ to that
of determining the irreducible representations of $\mathfrak{S}_{n+1}$ for which $\phi(X_1) = 0$.
Explicitly we have the following relation: Let $R_1, R_2, \ldots, R_s$ be irreducible
representations of $\mathfrak{S}_{n+1}$ satisfying $\phi(X_1) = 0$ and suppose that the $R_i$ are
not similar and that every irreducible representation for which $\phi(X_1) = 0$ is
similar to one of the $R_i$. Then if $N_i$ is the degree of $R_i$, $\mathfrak{U}$ is a direct sum
of $s$ complete matrix algebras $\Phi_{N_i}$. Hence the dimensionality of $\mathfrak{U}$ is $\sum_{i=1}^{s} N_i^2$.

In the remainder of this paper we shall apply the known theory of
representations of $\mathfrak{S}_{n+1}$ to determine the degrees $N_i$.


8. Representation theory for $\mathfrak{S}_{n+1}$.¹⁰ We distinguish odd and even
dimensionality and accordingly we take $n + 1 = 2\nu + 1$ and $n + 1 = 2\nu$.
In place of Gn, we consider first the Lie algebra Sn, of matrices A such 
that S-1A’S = — A where 


(37) |0 Wb]; 
| ly 0 
in the two cases. The Lie algebra Gn. is the set of matrices 
0 | —b, —bd, 
Qi1 Me 
(38) by | G12} » 
b | @21 
where 


It follows that the subset of matrices 


h Av 


(40) 


> 


is a maximal commutative subalgebra §. Also there exist elements eg € Oni 
such that 
[h, eq] = 


[ea, =hae§ 


10 The results stated in this section are due to E. Cartan [5] and H. Weyl [15]. 
All of them can be found in Weyl’s second paper, pp. 342-353. 




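where the roots $\alpha$ are

(42)  $\pm\lambda_i$, $\pm\lambda_i \pm \lambda_j$, $i < j$, for $n + 1 = 2\nu + 1$;
      $\pm\lambda_i \pm \lambda_j$, $i < j$, for $n + 1 = 2\nu$.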


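If we have a representation of $\mathfrak{S}_{n+1}$ we can replace it by a similar one
in which $h$ is represented by

(43)  $H = \operatorname{diag}\{\Lambda_1, \Lambda_2, \ldots, \Lambda_N\}$

where the $\Lambda_i$ are linear forms in the $\lambda_i$. These forms are called the weights
of the representation. If $\Lambda = \sum m_i \lambda_i$ is a linear form in the $\lambda_i$, the value of
$\Lambda$ for the $\lambda_i$ that give $h_\alpha$ is denoted as $\Lambda_\alpha$. Then it is easy to see that
$\Lambda_\alpha = \pm m_i \pm m_j$ if $\alpha = \pm\lambda_i \pm \lambda_j$ and $\Lambda_\alpha = \pm 2m_i$ if $\alpha = \pm\lambda_i$. In
particular, $\alpha_\alpha = 2$ in either case. It is known that if $\Lambda$ is a weight, then $2\Lambda_\alpha/\alpha_\alpha$
is an integer and $\Lambda - (2\Lambda_\alpha/\alpha_\alpha)\alpha$ is also a weight. It follows that the
coefficients of any weight are either all integers or all halves of odd integers;
and if $\Lambda$ is a weight, then any linear form that can be obtained from $\Lambda$
by an arbitrary permutation of the $\lambda_i$, or by replacing any $\lambda_i$ by $-\lambda_i$,
$n + 1 = 2\nu + 1$, any even number of the $\lambda_i$ by $-\lambda_i$, $n + 1 = 2\nu$, is a weight.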

One orders the weights lexicographically by stipulating that $\Lambda = \sum m_i \lambda_i > 0$
if the first non-zero $m_i$ is $> 0$. Then it is known that an irreducible
representation is determined up to a similarity by its highest weight. For
this weight we have

(44)  $m_1 \ge m_2 \ge \cdots \ge m_\nu \ge 0$, $n + 1 = 2\nu + 1$;
      $m_1 \ge m_2 \ge \cdots \ge m_{\nu-1} \ge |m_\nu|$, $n + 1 = 2\nu$.

Conversely, if $\Lambda = \sum m_i \lambda_i$ satisfies (44), then $\Lambda$ is the highest weight of an
irreducible representation. Hence there is a 1-1 correspondence between
the similarity classes of irreducible representations and the linear forms


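satisfying (44). The degree of the irreducible representation with highest
weight $\sum m_i \lambda_i$ is

(45)  $N(m_1, m_2, \ldots, m_\nu) = \dfrac{P(l_1, l_2, \ldots, l_\nu)}{P(\nu - \tfrac{1}{2}, \nu - \tfrac{3}{2}, \ldots, \tfrac{1}{2})}$,
      $P(u_1, u_2, \ldots, u_\nu) = \prod_i u_i \prod_{i<j} (u_i^2 - u_j^2)$,
      $l_i = m_i + \nu - (2i - 1)/2$, $n + 1 = 2\nu + 1$;

(46)  $N(m_1, m_2, \ldots, m_\nu) = \dfrac{P'(l_1, l_2, \ldots, l_\nu)}{P'(\nu - 1, \ldots, 1, 0)}$,
      $P'(u_1, u_2, \ldots, u_\nu) = \prod_{i<j} (u_i^2 - u_j^2)$,
      $l_i = m_i + \nu - i$, $n + 1 = 2\nu$.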


We recall also that all the weights of an irreducible representation are
obtained from a particular one by adding roots. Hence, if one of the weights
has integer coefficients, all have integer coefficients. Finally if $\Lambda$ is the highest
weight of a representation and $\Lambda_\alpha > 0$ then

$\Lambda, \ \Lambda - \alpha, \ \Lambda - 2\alpha, \ \ldots, \ \Lambda - \Lambda_\alpha \alpha$

are weights of the representation.

We consider now the special case of $\mathfrak{S}_3$. Here we have the roots $\lambda_1$,
$-\lambda_1$ and a basis $h_1$, $e = e_{\lambda_1}$, $f = e_{-\lambda_1}$ so that (36) holds. Let $h = \lambda_1 h_1$ and
$h \to H$ in an irreducible representation with maximum weight $m_1 \lambda_1$. Then
$m_1 \ge 0$ and $\Lambda_\alpha = 2m_1$ for $\alpha = \lambda_1$. Hence

$m_1 \lambda_1, \ (m_1 - 1)\lambda_1, \ \ldots, \ -m_1 \lambda_1$

are weights of the representation. It is easy to see from the results quoted
that these are the only weights that occur. The number $m_1$ is either an
integer or half of an odd integer. The degree of the irreducible representation
is the odd integer $2m_1 + 1$ in the former case and the even integer
$2m_1 + 1$ in the latter. Correspondingly the minimum polynomial of the
matrix $H_1$ representing $h_1$ is

(47)  (a) $(\lambda^2 - m_1^2)(\lambda^2 - (m_1 - 1)^2) \cdots (\lambda^2 - 1)\lambda$, if $m_1$ is an integer;
      (b) $(\lambda^2 - m_1^2)(\lambda^2 - (m_1 - 1)^2) \cdots (\lambda^2 - \tfrac{1}{4})$, if $m_1$ is half odd.


If we have a second polynomial of the form (47a) determined by the
integer $m_1' \le m_1$ then it is a factor of (47a). A similar remark holds for
(47b). It follows that the minimum polynomial of $H_1$ for any representation
either has the form (47a), (47b) or is a product of a polynomial of the first
type by one of the second. Thus the minimum polynomial without constant
term¹¹ is

(48)  $\psi_1(\lambda) =$
      (a) $(\lambda^2 - m_1^2)(\lambda^2 - (m_1 - 1)^2) \cdots (\lambda^2 - 1)\lambda$, $m_1$ an integer;
      (b) $(\lambda^2 - m_1^2)(\lambda^2 - (m_1 - 1)^2) \cdots (\lambda^2 - \tfrac{1}{4})\lambda$, $m_1$ half odd;
      (c) $(\lambda^2 - m_1^2) \cdots (\lambda^2 - 1)\lambda\,(\lambda^2 - m_2^2) \cdots (\lambda^2 - \tfrac{1}{4})$, $m_1$ an integer, $m_2$ half odd.


11 From now on the minimum polynomials will be taken in this sense. We assume
also that leading coefficients are always 1.




(49) 


Then 


(50) 




The three possibilities occur according as (a) the irreducible components all
have odd degrees and $2m_1 + 1$ is the maximum one, (b) the irreducible
components all have even degrees and $2m_1 + 1$ is the maximum one, (c) both
odd and even degrees occur for the irreducible components and $2m_1 + 1$ is
the largest odd and $2m_2 + 1$ the largest even one. This result gives a simple
proof of the following.


LEMMA. If $X$, $Y$ and $Z$ are three matrices such that $[X, Y] = Z$,
$[Y, Z] = X$, $[Z, X] = Y$ then the minimum polynomials without constant
term of $X$, $Y$ and $Z$ are all equal and have the form $\phi_1(\lambda) = \gamma\psi_1(\sqrt{-1}\,\lambda)$,
$\gamma$ a constant.

Proof. As in (35) and (36) we set $H_1 = \sqrt{-1}\,X$, $E = Y + \sqrt{-1}\,Z$,
$F = Y - \sqrt{-1}\,Z$. Then $[H_1, E] = E$, $[H_1, F] = -F$, $[F, E] = 2H_1$.
Hence $h_1 \to H_1$, $e \to E$, $f \to F$ defines a representation of $\mathfrak{S}_3$ and the minimum
polynomial of $H_1$ is given by (48) and is determined by the degrees of the
irreducible constituents of the representation. In the same way, we can set
$H_1 = \sqrt{-1}\,Y$ and $\sqrt{-1}\,Z$ in turn. This shows that $\sqrt{-1}\,X$, $\sqrt{-1}\,Y$,
$\sqrt{-1}\,Z$ have the same minimum polynomial $\psi_1(\lambda)$. It follows that $X$, $Y$, $Z$
have the minimum polynomial $\phi_1(\lambda) = \gamma\psi_1(\sqrt{-1}\,\lambda)$.


9. The structure of $\mathfrak{U}$. We consider now an irreducible representation
of $\mathfrak{S}_{n+1}$. As before we set $g_i = e_{i,n+1} - e_{n+1,i}$, $i = 1, 2, \ldots, n$, so that
$[g_i, g_j] = e_{ji} - e_{ij}$. Hence if $x = g_i$, $y = g_j$, $i \ne j$, and $z = [g_i, g_j]$ then
$[x, y] = z$, $[y, z] = x$, $[z, x] = y$. If $g_i \to G_i$ in our representation, then
the above lemma shows that all the $G_i$ and the $[G_i, G_j]$ have the same
minimum polynomial $\phi_1(\lambda) = \gamma\psi_1(\sqrt{-1}\,\lambda)$ where $\psi_1(\lambda)$ is given by (48).
We shall now show that the third possibility given in (48) is excluded for an
irreducible representation of $\mathfrak{S}_{n+1}$. For this purpose we introduce the matrices


0 
M’M | 2], MM 0 
0 


It follows that the mapping A > MAM" is an isomorphism of Snar OD Sra. 
This isomorphism sends the matrix h given by (40) into 


wah, 


| 




0 | | 


—V—1nr 


Thus the matrix 
SAiV—1 — Ci,vsi) 


plays the role of h. Hence we can suppose that the matrix H corresponding 
to this matrix in our representation has the form (43) where the weights A 
satisfy the conditions given in the preceding section. 

In particular we see that the coefficients of the weights are either all
integers or all half-odd integers. If we set $\lambda_1 = 1$, $\lambda_2 = \cdots = \lambda_\nu = 0$, we
obtain the matrix $H_1$ corresponding to $\sqrt{-1}\,(e_{\nu+1,1} - e_{1,\nu+1})$. Correspondingly
the weight $\sum m_i \lambda_i$ becomes the characteristic root $m_1$ of $H_1$. Thus the
characteristic roots of $H_1$ are either all integers or all half-odd. Hence we have
proved that either (48a) or (48b) holds. Also we see that $m_1$ in the formula
is the same as the $m_1$ in the highest weight $\Lambda = \sum m_i \lambda_i$.

We return now to the problem of determining the structure of the
universal algebra $\mathfrak{U}$ of (26) and (34). As we have shown this problem can
be solved by determining the irreducible representations of $\mathfrak{S}_{n+1}$ for which
$\phi(X_1) = 0$ where $x_1 = g_1 \to X_1$. Because of the form of the minimum
polynomial for the matrix corresponding to $g_1$ in an irreducible representation,




we can easily reduce the consideration of the $\mathfrak{U}$ determined by any $\phi(\lambda)$
to the two cases in which

(51)  $\phi(\lambda) = (\lambda^2 + m^2)(\lambda^2 + (m - 1)^2) \cdots (\lambda^2 + 1)\lambda$, $m$ an integer;
      $\phi(\lambda) = (\lambda^2 + m^2)(\lambda^2 + (m - 1)^2) \cdots (\lambda^2 + \tfrac{1}{4})\lambda$, $m$ half-odd.

From now on we denote the algebra associated with this polynomial by $\mathfrak{U}_m$,
$m = \tfrac{1}{2}, 1, 3/2, \ldots$. Now the minimum polynomial $\phi_1(\lambda)$ must be a factor
of $\phi(\lambda)$. Hence we see that for the irreducible representations we seek $\phi_1(\lambda)$
has the same form as $\phi(\lambda)$ with $m_1$ in place of $m$ and $m_1 \le m$. Our results
show that the maximum weight $\Lambda = \sum m_i \lambda_i$ satisfies (44) and

(52)  $m_1 \le m$, $\quad m_1 \equiv m \pmod 1$.

The degree of the irreducible representation is given by (45) and (46).
This proves the following

THEOREM. Let $\mathfrak{U}_m$ be the universal associative algebra over an algebraically
closed field of characteristic 0 of the system of equations (26) and
(34) where $\phi(\lambda)$ is given by (51). Then $\mathfrak{U}_m$ is a direct sum of complete
matrix algebras that are in 1-1 correspondence with the sets of linear
forms $\sum_{i=1}^{\nu} m_i \lambda_i$, $\nu = [(n + 1)/2]$, such that

$m \ge m_1 \ge m_2 \ge \cdots \ge m_\nu \ge 0$, $n + 1 = 2\nu + 1$,
$m \ge m_1 \ge m_2 \ge \cdots \ge m_{\nu-1} \ge |m_\nu|$, $n + 1 = 2\nu$,
$m_i \equiv m \pmod 1$.

The dimensionality of the matrix component corresponding to the form
$\sum m_i \lambda_i$ is $N(m_1, \ldots, m_\nu)^2$ where $N(m_1, \ldots, m_\nu)$ is given by (45) and
(46).

Remarks. If $m$ is an integer, the linear form 0 is excluded since in
the corresponding representation of $\mathfrak{S}_{n+1}$ and of $\mathfrak{U}_m$ the representing elements
are all 0. The dimensionality of $\mathfrak{U}_m$ is, of course, $\sum N(m_1, \ldots, m_\nu)^2$.
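
The theorem can be tried out numerically. The following Python sketch is a minimal illustration and not the author's computation: it assumes the usual Weyl form of the degree formulas (45) and (46), with $l_i = m_i + \nu - i + \tfrac{1}{2}$ for $n + 1 = 2\nu + 1$ and $l_i = m_i + \nu - i$ for $n + 1 = 2\nu$, enumerates the admissible highest weights of the theorem, and sums the squares of the degrees. The function names `weyl_degree`, `highest_weights` and `dimension` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def weyl_degree(weight, n_plus_1):
    """Degree N(m_1, ..., m_nu) for skew matrices of n_plus_1 rows (assumed form of (45), (46))."""
    nu = n_plus_1 // 2
    odd = n_plus_1 % 2 == 1
    half = Fraction(1, 2) if odd else Fraction(0)
    rho = [Fraction(nu - i) + half for i in range(1, nu + 1)]
    ell = [Fraction(m) + r for m, r in zip(weight, rho)]

    def p(u):
        value = Fraction(1)
        for i in range(len(u)):
            if odd:
                value *= u[i]  # extra factor present only when n + 1 is odd
            for j in range(i + 1, len(u)):
                value *= u[i] ** 2 - u[j] ** 2
        return value

    return p(ell) / p(rho)


def highest_weights(m, n_plus_1):
    """Non-zero coefficient tuples satisfying the conditions of the theorem."""
    m = Fraction(m)
    nu = n_plus_1 // 2
    odd = n_plus_1 % 2 == 1
    values = [m - k for k in range(int(2 * m) + 1)]  # m, m - 1, ..., -m
    found = []
    for w in product(values, repeat=nu):
        if any(w[i] < w[i + 1] for i in range(nu - 1)):
            continue  # coefficients must be non-increasing
        if odd and w[-1] < 0:
            continue  # m_nu >= 0 when n + 1 is odd
        if not odd and nu >= 2 and w[-2] < abs(w[-1]):
            continue  # m_(nu-1) >= |m_nu| when n + 1 is even
        if all(c == 0 for c in w):
            continue  # the linear form 0 is excluded
        found.append(w)
    return found


def dimension(m, n):
    """Sum of the squared degrees over all admissible highest weights."""
    return sum(weyl_degree(w, n + 1) ** 2 for w in highest_weights(m, n + 1))


for n in range(1, 6):
    print(n, dimension(Fraction(1, 2), n), dimension(1, n), comb(2 * n + 1, n) - 1)
```

For $m = \tfrac{1}{2}$ the sum reproduces $2^n$, the dimensionality of the Clifford algebra on $n$ generators, and for $m = 1$ it reproduces $\binom{2n+1}{n} - 1$. Exact rational arithmetic is used so that the half-integral weights introduce no rounding.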


10. Meson Jordan triple systems. We shall now see how our results
specialize to give the known structure of the universal associative algebra of
the meson J.t.s.¹² As we have seen in §7 this algebra is the universal
algebra $\mathfrak{U}_1$. Hence the possible maximum weights for the irreducible
representations are

(53)  $\lambda_1, \ \lambda_1 + \lambda_2, \ \ldots, \ \lambda_1 + \cdots + \lambda_\nu$, $n + 1 = 2\nu + 1$;
      $\lambda_1, \ \lambda_1 + \lambda_2, \ \ldots, \ \lambda_1 + \cdots + \lambda_\nu, \ \lambda_1 + \cdots + \lambda_{\nu-1} - \lambda_\nu$, $n + 1 = 2\nu$.


12 Svartholm [14] and D. E. Littlewood [13] employ associative methods to derive
these results. Cf. also Kemmer [11].


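It is known that the associated irreducible representations are those whose
representation spaces are the spaces of skew symmetric tensors of ranks
$1, 2, \ldots, \nu$, if $n + 1 = 2\nu + 1$ and the spaces of skew symmetric tensors of ranks
$1, 2, \ldots, \nu - 1$ plus the two irreducible spaces into which the space of skew symmetric
tensors of rank $\nu$ splits if $n + 1 = 2\nu$. For the sake of brevity we shall refer
to these representations as the irreducible tensor representations. The number
of simple components of $\mathfrak{U}_1$ is $\nu$ or $\nu + 1$ and their degrees are

$\binom{n+1}{1}, \binom{n+1}{2}, \ldots, \binom{n+1}{\nu}$, $n + 1 = 2\nu + 1$;
$\binom{n+1}{1}, \ldots, \binom{n+1}{\nu-1}, \tfrac{1}{2}\binom{n+1}{\nu}, \tfrac{1}{2}\binom{n+1}{\nu}$, $n + 1 = 2\nu$.

Hence the dimensionality of $\mathfrak{U}_1$ is

$d_n = \sum_{i=1}^{\nu} \binom{n+1}{i}^2$, $n + 1 = 2\nu + 1$;
$d_n = \sum_{i=1}^{\nu-1} \binom{n+1}{i}^2 + \tfrac{1}{2}\binom{n+1}{\nu}^2$, $n + 1 = 2\nu$.

Using the formula $\sum_{i=0}^{r} \binom{r}{i}^2 = \binom{2r}{r}$ we see that in either case

(55)  $d_n = \binom{2n+1}{n} - 1$.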
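11. A special 1-1 representation of $\mathfrak{U}_m$. Let $p_0, p_1, \ldots, p_n$ be generators
of the Clifford algebra $T$ satisfying the multiplication table

(56)  $p_r^2 = p_0$, $\quad p_0 p_r = p_r = p_r p_0$, $\quad r = 0, 1, \ldots, n$,
      $p_i p_j = -p_j p_i$, $\quad i \ne j$, $\quad i, j = 1, \ldots, n$.¹³

Then $p_0 = 1$ is the identity in $T$ and we can verify that

(57)  $[[p_i, p_j], p_k] = -4\delta_{ik} p_j + 4\delta_{jk} p_i$, $\quad p_i^3 = p_i$, $\quad i, j, k = 1, 2, \ldots, n$.

Hence if we set $q_i = \tfrac{1}{2}\sqrt{-1}\,p_i$ then

(58)  $[[q_i, q_j], q_k] = \delta_{ik} q_j - \delta_{jk} q_i$, $\quad q_i^3 + \tfrac{1}{4} q_i = 0$.


13 This algebra may be regarded as the universal associative algebra of a certain
Jordan algebra. Cf. Jacobson and Jacobson [7].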




Also it is clear that the enveloping algebra of the $q_i$ is $T$ itself, and as is
well-known the dimensionality of $T$ is $2^n$.
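
The relations satisfied by the $q_i$ can be checked on a concrete example. The sketch below is an illustration under stated assumptions rather than part of the original proof: it takes the three Pauli matrices as anticommuting elements $p_1, p_2, p_3$ with $p_i^2 = 1$ (one particular representation with $n = 3$), sets $q_i = \tfrac{1}{2}\sqrt{-1}\,p_i$, and tests the relation $[[q_i, q_j], q_k] = \delta_{ik} q_j - \delta_{jk} q_i$ together with $q_i^3 + \tfrac{1}{4} q_i = 0$. The sign convention used for the triple commutator relation is the one consistent with $[x, y] = z$, $[z, x] = y$ and is assumed here. All helper names are hypothetical.

```python
def mul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]


def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]


def scale(s, a):
    return [[s * a[r][c] for c in range(2)] for r in range(2)]


def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(mul(a, b), scale(-1, mul(b, a)))


# Pauli matrices: pairwise anticommuting, each with square the identity.
p = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
q = [scale(0.5j, m) for m in p]
zero = [[0, 0], [0, 0]]


def triple_relation_holds():
    """Check [[q_i, q_j], q_k] = delta_ik q_j - delta_jk q_i for all i, j, k."""
    for i in range(3):
        for j in range(3):
            for k in range(3):
                lhs = comm(comm(q[i], q[j]), q[k])
                rhs = add(scale(int(i == k), q[j]), scale(-int(j == k), q[i]))
                if lhs != rhs:
                    return False
    return True


def cubic_relation_holds():
    """Check q_i^3 + q_i / 4 = 0 for every i."""
    return all(add(mul(mul(m, m), m), scale(0.25, m)) == zero for m in q)


print(triple_relation_holds(), cubic_relation_holds())
```

All entries are multiples of powers of $\tfrac{1}{2}$, so the floating-point comparisons are exact in this example.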

On the other hand, let us consider now the universal algebra $\mathfrak{U}_{1/2}$ associated
with the polynomial $\phi(\lambda) = \lambda^3 + \tfrac{1}{4}\lambda$. The irreducible representations of
this algebra have maximum weights

(59)  $\tfrac{1}{2}(\lambda_1 + \lambda_2 + \cdots + \lambda_\nu)$, $n + 1 = 2\nu + 1$;
      $\tfrac{1}{2}(\lambda_1 + \cdots + \lambda_{\nu-1} \pm \lambda_\nu)$, $n + 1 = 2\nu$

and are the well-known spinor representations of $\mathfrak{S}_{n+1}$.¹⁴ The dimensionality
of $\mathfrak{U}_{1/2}$ can be computed by the formulas to be $(2^\nu)^2 = 2^n$ if $n + 1 = 2\nu + 1$
and $2(2^{\nu-1})^2 = 2^n$ if $n + 1 = 2\nu$. Now by (58), the $q_i$ of $T$ satisfy the
relations imposed on the generators $x_i$ of $\mathfrak{U}_{1/2}$. Hence since $T$ and $\mathfrak{U}_{1/2}$ have
the same dimensionality we can identify $x_i$ with $q_i$ and regard $T$ as the
universal algebra $\mathfrak{U}_{1/2}$.

Our results now give the known structure of $T$: If $n + 1 = 2\nu + 1$,
$T$ is a complete matrix algebra of $2^\nu$ rows and columns and if $n + 1 = 2\nu$
then $T$ is a direct sum of two complete matrix algebras each of $2^{\nu-1}$ rows and
columns. In either case $T$ has a 1-1 representation $S$ by matrices of $2^\nu$
rows and columns.¹⁵ We propose to show that this representation can be
used to give a 1-1 representation for $\mathfrak{U}_m$ for any $m$.

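We recall that if $R_1$ and $R_2$ are two representations of a Lie algebra $\mathfrak{L}$
of respective degrees $n_1$ and $n_2$ then

(60)  $a \to a^{R_1 \times R_2} \equiv a^{R_1} \times 1_{n_2} + 1_{n_1} \times a^{R_2}$

is a representation of $\mathfrak{L}$ of degree $n_1 n_2$. This representation is analogous to
the direct product representation for groups and directly corresponds to the
latter for Lie groups. It is evident that if $R_1$ is decomposed as

$a^{R_1} = \begin{pmatrix} a^{R_{11}} & 0 \\ 0 & a^{R_{12}} \end{pmatrix}$

then $R_1 \times R_2$ is decomposed as

$\begin{pmatrix} a^{R_{11}} \times 1_{n_2} + 1_{n_{11}} \times a^{R_2} & 0 \\ 0 & a^{R_{12}} \times 1_{n_2} + 1_{n_{12}} \times a^{R_2} \end{pmatrix}$

where $n_{1i}$ is the degree of $R_{1i}$. Since $a \times b$ is similar to $b \times a$ a similar
distributivity can be proved for the second factor. Hence, if $R_1$ is similar to a
direct sum, $R_1 \sim R_{11} + R_{12} + \cdots + R_{1s}$ and $R_2 \sim R_{21} + R_{22} + \cdots + R_{2t}$
then $R_1 \times R_2 \sim \sum_{i,j} R_{1i} \times R_{2j}$.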


14 Cartan [5], p. 86; Brauer and Weyl [4].
15 Explicit formulas for the representing matrices are given in Brauer and Weyl
[4], p. 429 and p. 433.




Also it is clear that if $h$ is represented by the diagonal matrix $h^{R_1}$ with
diagonal elements $\Lambda_i$ and by the diagonal matrix $h^{R_2}$ with diagonal elements
$M_j$ then $h^{R_1 \times R_2}$ is a diagonal matrix with diagonal elements $\Lambda_i + M_j$.

In particular we consider the representation $S$ of $\mathfrak{S}_{n+1}$ (or of $\mathfrak{U}_{1/2}$). The
weights of $S$ are the $2^\nu$ linear forms $\pm\tfrac{1}{2}\lambda_1 \pm \tfrac{1}{2}\lambda_2 \pm \cdots \pm \tfrac{1}{2}\lambda_\nu$. Hence the
weights of the representation $S^2 = S \times S$ are the linear forms $0$, $\pm\lambda_{i_1} \pm \lambda_{i_2}
\pm \cdots \pm \lambda_{i_k}$, $k \le \nu$. The ones of these that can be maximal are $0$, $\lambda_1 + \lambda_2 + \cdots + \lambda_k$
for $n + 1 = 2\nu + 1$ and $k \le \nu$, and $0$, $\lambda_1 + \lambda_2 + \cdots + \lambda_k$,
$\lambda_1 + \cdots + \lambda_{\nu-1} - \lambda_\nu$ for $n + 1 = 2\nu$. Hence, the possible irreducible
constituents of $S^2$ are 0 and the irreducible tensor representations. It is known
that all of these actually do occur. Hence the representation $S^2$ has the same
irreducible constituents $\ne 0$ as the universal algebra $\mathfrak{U}_1$ of the meson J.t.s.
It follows that $S^2$ gives a 1-1 representation of $\mathfrak{U}_1$. We have, therefore,
established the case $m = 1$ of the following


THEOREM. If $S$ denotes the 1-1 representation of $\mathfrak{S}_{n+1}$ determined
by the spinor representations, then the product $S^{2m}$ of $2m$ representations
equivalent to $S$ gives a 1-1 representation of $\mathfrak{U}_m$, $m = \tfrac{1}{2}, 1, 3/2, \ldots$.¹⁶

Proof. Since the weights of $S$ are $\pm\tfrac{1}{2}\lambda_1 \pm \tfrac{1}{2}\lambda_2 \pm \cdots \pm \tfrac{1}{2}\lambda_\nu$, the
weights of the $2m$-fold product of $S$ by itself have coefficients that are
$\equiv m \pmod 1$ and are in absolute value $\le m$. Hence the maximum weights
of the irreducible constituents satisfy the conditions of the main theorem (§9).
It remains to show that every irreducible representation satisfying these
conditions is a component of $S^{2m}$. We know that this holds for $m = \tfrac{1}{2}$ and
$m = 1$ and we assume that it has already been proved for $m - 1$. Thus, any
linear form satisfying the conditions of the main theorem for $m - 1$ in place
of $m$ occurs as the maximum weight of an irreducible component of $S^{2m-2}$.
Also $\lambda_1 + \lambda_2 + \cdots + \lambda_k$, $k \le \nu$, and $\lambda_1 + \cdots + \lambda_{\nu-1} - \lambda_\nu$, $n + 1 = 2\nu$,
are maximum weights of the irreducible components of $S^2$. Now let
$\Lambda = \sum m_i \lambda_i$ satisfy the conditions of the main theorem for the number $m$.
Then $\Lambda = \Lambda' + (\lambda_1 + \cdots + \lambda_k)$ for some $k$, or
$\Lambda = \Lambda' + (\lambda_1 + \cdots + \lambda_{\nu-1} - \lambda_\nu)$, $n + 1 = 2\nu$, where $\Lambda'$ is maximum weight of an irreducible
component in $S^{2m-2}$. Hence $\Lambda$ is the maximum weight of a product of an
irreducible component in $S^{2m-2}$ and an irreducible component of $S^2$. It follows
that $\Lambda$ is a weight of an irreducible component of $S^{2m}$. This completes the
proof.


16 This type of representation of $\mathfrak{S}_{n+1}$ has been considered by Kramers, Belinfante,
and Lubanski [12] in generalizing spinors to obtain quantities called undors. A
determination of the irreducible representations contained in $S \times S \times \cdots \times S$ for the
case $n = 4$ has been given by Lubanski [14].


YALE UNIVERSITY. 


BIBLIOGRAPHY. 


A. A. Albert, “On Jordan algebras of linear transformations,” Transactions of the
American Mathematical Society, vol. 59 (1946), pp. 524-555.

A. A. Albert, “A structure theory for Jordan algebras,” Annals of Mathematics, vol. 48
(1947), pp. 546-567.

G. Birkhoff and P. Whitman, “Representation of Jordan and Lie algebras,”
to appear in the Transactions of the American Mathematical Society.

R. Brauer and H. Weyl, “Spinors in n dimensions,” American Journal of
Mathematics, vol. 57 (1935), pp. 425-449.

E. Cartan, “Les groupes projectifs qui ne laissent invariante aucune multiplicité
plane,” Bulletin de la Société Mathématique de France, vol. 41 (1913),
pp. 53-96.

R. J. Duffin, “On the characteristic matrices of covariant systems,” Physical
Review, vol. 54 (1938), p. 1114.

N. Jacobson and F. D. Jacobson, “Structure and representation of semi-simple
Jordan algebras,” to appear in the Transactions of the American
Mathematical Society.

P. Jordan, “Über die Multiplikation quanten-mechanischer Grössen,” Zeitschrift
für Physik, vol. 80 (1933), pp. 285-291.

P. Jordan, J. v. Neumann and E. Wigner, “On an algebraic generalization of the
quantum mechanical formalism,” Annals of Mathematics, vol. 35 (1934),
pp. 29-64.

N. Kemmer, “Particle aspect of meson theory,” Proceedings of the Royal Society,
vol. 173 (1939), pp. 91-116.

N. Kemmer, “The algebra of meson matrices,” Proceedings of the Cambridge
Philosophical Society, vol. 39 (1943), pp. 189-196.

H. A. Kramers, F. J. Belinfante and J. K. Lubanski, “Über freie Teilchen mit
nicht verschwindender Masse und beliebiger Spinquantenzahl,” Physica,
vol. 8 (1941), pp. 597-627.

D. E. Littlewood, “An equation of quantum mechanics,” Proceedings of the
Cambridge Philosophical Society, vol. 43 (1947), pp. 406-413.

J. K. Lubanski, “Sur la théorie des particules élémentaires de spin quelconque,”
I and II, Physica 9 (1942), pp. 310-324 and 325-338.

N. Svartholm, “On the algebras of relativistic quantum theories,” Proceedings of
the Royal Physiographical Society of Lund, vol. 12 (1942), pp. 94-108.

H. Weyl, “Theorie der Darstellung kontinuierlicher halbeinfacher Gruppen durch
lineare Transformationen,” I, II and III, Mathematische Zeitschrift, vols.
23-24 (1925-1926), pp. 271-304, 328-376, 377-395.




A CLASS OF INVERSE LIMIT GROUPS.* 


By FRANKLIN HAIMO.


1. Introduction. Groups $G$ to be considered here are to be additive
abelian. The paper is concerned with certain inverse limit groups¹ of sets
of quotient groups of $G$. The relation of these limit groups to related vector
groups and to certain quotient groups of $G$ will be studied. If $G$ has at least
one compact topology, the inverse limit groups may take on particularly simple
forms. Recently, MacLane² has apparently used a special case of the type of
limit group to be developed here.

Specifically, let $\mathfrak{H}$ be a directed system¹ of subgroups of a fixed group
$G$, where $\mathfrak{H}$ is indexed by a directed set $A$, partially ordered by a relation $<$,
in such a way that $\alpha \in A$, $\beta \in A$, $\alpha < \beta$ imply $H_\alpha \supset H_\beta$, and conversely. It
follows that if $H_\alpha$ and $H_\beta$ are in $\mathfrak{H}$, there exists $H_\gamma \in \mathfrak{H}$ with $H_\alpha \supset H_\gamma$,
$H_\beta \supset H_\gamma$. There then exists a natural homomorphism for each ordered pair
of elements of $A$ which are in the relation $\alpha < \beta$, $\phi_{\alpha\beta}$ of $G/H_\beta$ onto $G/H_\alpha$,
with kernel isomorphic to $H_\alpha/H_\beta$. This statement of the second isomorphism
theorem, as applied to the groups of cosets of $H_\alpha$ and $H_\beta$ in $G$, is, in part,
$\phi_{\alpha\beta}(g + H_\beta) = g + H_\alpha$, where $H_\alpha \supset H_\beta$, $g \in G$. If $\alpha < \beta < \gamma$, $\phi_{\alpha\beta}\phi_{\beta\gamma} = \phi_{\alpha\gamma}$.
If $\alpha < \beta$, we write $G/H_\alpha < G/H_\beta$. The system of quotient groups $\{G/H_\alpha\}$,
$\alpha \in A$, under $<$ and with the $\phi$'s, forms an inverse limit group of a special
sort. We denote it by invlim $(G; \mathfrak{H})$.


2. Canonical elements and compact groups. Let $\{g_\alpha + H_\alpha\}$ denote
an element of invlim $(G; \mathfrak{H})$. Here, the $g_\alpha$ are in $G$ and are subject to the
relations $g_\alpha - g_\beta \in H_\alpha$ whenever $H_\beta \subset H_\alpha$. Conversely, these relations imply
the existence of an element of invlim $(G; \mathfrak{H})$. Suppose that there exists
$g \in G$ such that $g - g_\alpha \in H_\alpha$ for every $\alpha \in A$. Then $\{g_\alpha + H_\alpha\} = \{g + H_\alpha\}$.
Such an element of invlim $(G; \mathfrak{H})$ is called canonical. If all the elements


* Received April 16, 1948. Presented by title (of different wording) to the Society, 
April 16-17, 1948. 

1 Definitions and theorems concerning directed sets and inverse limit groups (in
general) are to be found in S. Lefschetz, Algebraic Topology, New York, 1942. See
especially pp. 4-5, 31-32, 54-56.

2S. MacLane, “ The group of abelian group extensions,” (an abstract), Bulletin of 
the American Mathematical Society, vol. 54 (1948), p. 53. 



of invlim $(G; \mathfrak{H})$ are canonical, the inverse limit group, itself, is called
canonical. Let the crosscut $\bigcap_{\alpha \in A} H_\alpha$ be denoted by $H$.


THEOREM I. The set $K(G; \mathfrak{H})$ of canonical elements of invlim $(G; \mathfrak{H})$
is a subgroup (the canonical subgroup) of invlim $(G; \mathfrak{H})$ and is isomorphic
to $G/H$, where $H = \bigcap_{\alpha \in A} H_\alpha$.

Proof. The subgroup property is obvious. Suppose that $\{g + H_\alpha\}$ is a
canonical element. Define a homomorphism $\Theta$ of $K(G; \mathfrak{H})$ onto $G/H$ by
$\Theta(\{g + H_\alpha\}) = g + H$. If $\{g + H_\alpha\} = \{g' + H_\alpha\}$, then $g - g' \in H_\alpha$ for every
$\alpha \in A$ so that $g - g' \in H$, and $g + H = g' + H$. The converse statement
also holds so that $\Theta$ is unique and has the kernel (0).


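THEOREM II. Let $\Theta$ be a homomorphism of $G$ for which kern $\Theta \subset H
= \bigcap_{\alpha \in A} H_\alpha$. Then invlim $(G; \mathfrak{H})$ is canonical if, and only if, invlim $(\Theta(G);
\Theta(\mathfrak{H}))$ is likewise canonical. [Here, $\Theta(\mathfrak{H})$ is a system of subgroups of $\Theta(G)$,
also directed by $\supset$ and indexed by $A$, and consists of the maps under induced
$\Theta$ of the elements of $\mathfrak{H}$.]


Proof. Let $\{\Theta(g_\alpha) + \Theta(H_\alpha)\}$ be an element of invlim $(\Theta(G); \Theta(\mathfrak{H}))$.
Then, if $H_\alpha \supset H_\beta$, $\Theta(g_\alpha) - \Theta(g_\beta) \in \Theta(H_\alpha)$. Hence, there exist antecedents
of $\Theta(g_\alpha)$ and $\Theta(g_\beta)$, $g_\alpha$ and $g_\beta$, respectively, such that $g_\alpha - g_\beta \in H_\alpha$, since
the complete inverse image of $\Theta(H_\alpha)$ is $H_\alpha$. Thus, $\{g_\alpha + H_\alpha\}$ is an element
of invlim $(G; \mathfrak{H})$. Suppose that invlim $(G; \mathfrak{H})$ is canonical. There exists $g \in G$
such that $g - g_\gamma \in H_\gamma$ for every $\gamma \in A$. It follows that $\Theta(g) - \Theta(g_\gamma) \in \Theta(H_\gamma)$.
Therefore, $\{\Theta(g_\gamma) + \Theta(H_\gamma)\} = \{\Theta(g) + \Theta(H_\gamma)\}$, so that invlim $(\Theta(G);
\Theta(\mathfrak{H}))$ is canonical.


Conversely, suppose that invlim $(\Theta(G); \Theta(\mathfrak{H}))$ is canonical. Then,
$\{g_\alpha + H_\alpha\} \in$ invlim $(G; \mathfrak{H})$ implies that $\{\Theta(g_\alpha) + \Theta(H_\alpha)\}$ is in invlim
$(\Theta(G); \Theta(\mathfrak{H}))$. There exists $\Theta(g)$ such that $\Theta(g) - \Theta(g_\alpha) \in \Theta(H_\alpha)$ for
every $\alpha \in A$. Hence, $g - g_\alpha \in H_\alpha$ for every $\alpha$, so that $\{g_\alpha + H_\alpha\} = \{g + H_\alpha\}$;
and invlim $(G; \mathfrak{H})$ is, consequently, canonical.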


THEOREM III. Let $G$ possess at least one topology, making it a compact
topological group in such a way that the members of a system of subgroups $\mathfrak{H}$
(directed by $\supset$), are closed. Then invlim $(G; \mathfrak{H})$ is canonical.


Proof. Let an element of invlim $(G; \mathfrak{H})$ be $\{g_\alpha + H_\alpha\}$. Then the cosets
$(g_\alpha + H_\alpha)$ are closed³ as point-sets of $G$. Let $H_{\alpha_1}, H_{\alpha_2}, \ldots, H_{\alpha_m}$ be any
finite subset of $\mathfrak{H}$. There exists (since $\mathfrak{H}$ is directed by $\supset$) $H_\beta \in \mathfrak{H}$,
$H_\beta \subset H_{\alpha_i}$, $i = 1, 2, \ldots, m$. Consequently, any finite subset $(g_{\alpha_i} + H_{\alpha_i})$,
$i = 1, 2, \ldots, m$, of cosets has a non-void intersection, since each of these cosets
contains $g_\beta$. By compactness, this
must also be true for the crosscut of all the cosets $(g_\alpha + H_\alpha)$, $\alpha \in A$. Hence
there exists $g \in (g_\alpha + H_\alpha)$, for all $\alpha \in A$. Therefore, $\{g_\alpha + H_\alpha\} = \{g + H_\alpha\}$;
and invlim $(G; \mathfrak{H})$ is, accordingly, canonical.


3 L. Pontrjagin, Topological Groups, Princeton, 1939. See p. 53.

Let $m$ and $n$ be positive integers. If $n$ is divisible by $m$, $mG \supset nG$.
Using the notation⁴ $G/sG = G_s$, we write $G_m < G_n$ if, and only if, $mG \supset nG$.
It is well-known⁵ that if $G$ is compact then $mG$ is closed for every positive
integer $m$.

COROLLARY. Let $G'$ be the set of completely divisible elements of $G$.
(That is, $G' = \bigcap_{m=1}^{\infty} mG$.) Let $G$ have at least one topology making it a compact
topological group. Then invlim $(G; \{nG\})$ is canonical and is isomorphic to
$G/G'$. (Here $\mathfrak{H} = \{nG\}$ is the set of all $nG$, $n$ a positive integer, directed
by $\supset$.)


Specializations of the groups used in the corollary have been studied
by MacLane.²

Let⁴ ${}_nG$ be the set of all $g \in G$ for which $ng = 0$. In a compact group
the ${}_nG$ are closed. It follows that invlim $(G; \{{}_nG\})$ is canonical and
isomorphic to $G$. However, this result holds even if $G$ cannot be compact
and is, indeed, a trivial consequence of the following theorem.


THEOREM IV. Let $\mathfrak{H}'$ be a subset of $\mathfrak{H}$ which is cofinal therein. Let
invlim $(G; \mathfrak{H}')$ be canonical. Then invlim $(G; \mathfrak{H})$ is canonical and is
isomorphic to invlim $(G; \mathfrak{H}')$.


Proof. This follows immediately from a well-known theorem on inverse
limit groups in general.¹

We say that the system $\mathfrak{H}$, indexed by $A$, contains a “last” element $H_\alpha$
under $\supset$ if $H_\beta \in \mathfrak{H}$ implies $H_\beta \supset H_\alpha$ for every $\beta \in A$.


COROLLARY. If $\mathfrak{H}$ contains a last element under $\supset$, then invlim $(G; \mathfrak{H})$
is canonical.


In the case of the ${}_nG$ above, ${}_1G = (0)$ is the last member of the set
under $\supset$. Hence, always, $G$ is isomorphic to invlim $(G; \{{}_nG\})$, and the latter
is canonical. Another such example is furnished by the set $\mathfrak{H}$ of all subgroups
of a group $G$ which include a fixed subgroup $H$ of $G$. Here,
invlim $(G; \mathfrak{H})$ is canonical and is isomorphic to $G/H$.

Let the system $\mathfrak{H}$ be a refinement of the system $\mathfrak{H}'$. That is, $\mathfrak{H}'$
is to be a subset of $\mathfrak{H}$, and the partial ordering relation of the directed
system $\mathfrak{H}'$ is to be extended to the partial ordering relation of the directed
system $\mathfrak{H}$. Then there exists a homomorphism of invlim $(G; \mathfrak{H})$ into
invlim $(G; \mathfrak{H}')$, a projection. The canonical subgroup of the former maps
onto the canonical subgroup of the latter under this homomorphism.


4 For some of the notation and for definitions see pp. 554-557 of P. Alexandroff and
H. Hopf, Topologie I, Berlin, 1935.


3. A non-canonical inverse limit group. We shall exhibit the existence
of non-canonical inverse limit groups by a simple example. Let p(1), p(2),
..., be the enumeration of positive primes by increasing size. To each
positive integer s > 1, there corresponds a sequence of non-negative integral
exponents (almost all zero), m(1,s), m(2,s), ..., such that

$$s = \prod_i p(i)^{m(i,s)}.$$

Let h(s), a non-negative integer, $0 \le h(s) < s$, be a solution of the simultaneous
congruences

$$h(s) \equiv i - 1 \pmod{p(i)^{m(i,s)}}.$$

h(s) exists and is unique by the so-called Chinese remainder theorem.⁶

Let I be the group of integers. invlim (I; {sI}) has the element
{h(s) + sI}. For, if t and u are positive integers greater than 1, then t | u
implies $m(i,t) \le m(i,u)$. It follows that

$$h(u) \equiv i - 1 \pmod{p(i)^{m(i,t)}},$$

$$h(t) \equiv i - 1 \pmod{p(i)^{m(i,t)}}.$$

Thus $h(u) - h(t) \equiv 0 \pmod{p(i)^{m(i,t)}}$, i = 1, 2, ..., and this implies
$h(u) - h(t) \in tI$.

On the other hand, {h(s) + sI} is not canonical. Otherwise, there would
exist $h \in I$ such that $h \equiv h(s) \equiv i - 1 \pmod{p(i)^{m(i,s)}}$,
s = 2, 3, .... But there exists s > 1, for which m(h,s) > 0. For
such s, $h \equiv h - 1 \pmod{p(h)}$, and this is impossible.
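The construction of h(s) can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `primes`, `prime_power_factors`, `h` and `witness` are hypothetical, and the search bounds are arbitrary. It builds h(s) by successive Chinese-remainder steps, so that one may test that t | u gives h(u) ≡ h(t) mod t, while no single integer in a small range is congruent to h(s) modulo every s.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small inputs)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def prime_power_factors(s):
    """Return [(i, p(i)**m(i, s))] for the primes dividing s; i is the 1-based index of the prime."""
    out = []
    for i, p in enumerate(primes(), start=1):
        if s == 1:
            break
        q = 1
        while s % p == 0:
            s //= p
            q *= p
        if q > 1:
            out.append((i, q))
    return out


def h(s):
    """The residue 0 <= h < s with h = i - 1 mod p(i)**m(i, s) for each prime power in s."""
    value, modulus = 0, 1
    for i, q in prime_power_factors(s):
        # Chinese remainder step: lift `value` mod `modulus` to a residue mod modulus * q.
        inverse = pow(modulus, -1, q)  # modular inverse, Python 3.8 or later
        value += modulus * (((i - 1 - value) * inverse) % q)
        modulus *= q
    return value % s


def witness(candidate, bound):
    """Return some s in 2..bound with candidate not congruent to h(s) mod s, else None."""
    for s in range(2, bound + 1):
        if (candidate - h(s)) % s != 0:
            return s
    return None
```

For a positive candidate h the text's choice s = p(h) is always such a witness; the function above merely reports the first one it meets.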

A trivial consequence of this example is the well-known fact that I,


⁵ S. Lefschetz, op. cit., p. 68.
⁶ See, for instance, p. 18 of C. C. MacDuffee, An Introduction to Abstract Algebra,
New York, 1940.


A CLASS OF INVERSE LIMIT GROUPS. 175 


the group of integers, can never be a compact topological group.’ For, if it 
were, invlim (I; {nI}) would be canonical, by Theorem III, Corollary; and
this is a contradiction. 

The group invlim (I; {nI}) can be used to show that the result of
Theorem II need not hold if the restriction on kern Θ is set aside. For, in
the case of a proper homomorphic image Θ(I) of I, invlim (Θ(I); Θ({nI}))
is canonical while invlim (I; {nI}) is not. In fact, the kernel of no proper
homomorphism of I can be included in $\bigcap_{n=1}^{\infty} nI$, since the latter is (0).

4. The division hull. The group invlim (G; $\mathfrak{H}$) is a subgroup of the
cartesian product of all the $G/H_\alpha$, $\alpha \in A$, the so-called strong direct sum or
vector group of the $G/H_\alpha$: $\sum_{\alpha \in A} \oplus' \, G/H_\alpha$. Here, the symbol
$\sum_{\alpha \in A} \oplus' B_\alpha$ stands for the cartesian product of the groups $B_\alpha$ with
component-wise addition. It differs from the direct sum $\sum_{\alpha \in A} \oplus B_\alpha$ in that
elements with an infinite number of non-zero components are allowed. We should like
to know something of the relation of the inverse limit group to the corresponding vector
group. A few results are obtained here.

Following Alexandroff-Hopf,⁴ H°, the division hull of a subgroup H
of G, is the set of all $g \in G$ such that there exists a non-zero integer n(g) with
$n(g) \cdot g \in H$. A subgroup H for which H° = H is called a subgroup with
division (or H is said to have division in G).


THEOREM V. If each $H_\alpha \in \mathfrak{H}$ is a subgroup with division in G, then
invlim (G; $\mathfrak{H}$) has division in the corresponding vector group. Let $\mathfrak{H}$ have
no last element under $\supset$; or let $\mathfrak{H}$ have a last element which is the
intersection of all the other members of $\mathfrak{H}$. (In particular, the latter implies
that $\mathfrak{H}$ has more than one member.) Then the converse of the above statement holds.


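Proof. Suppose that each $H_\alpha$ has division in G. Let $z = \{g_\alpha + H_\alpha\}$
be an element of $\sum \oplus' G/H_\alpha$, where the $G/H_\alpha$ are treated as groups of cosets.
Suppose that $nz \in$ invlim (G; $\mathfrak{H}$) for a non-zero integer n. Then $H_\alpha \supset H_\beta$
implies $n(g_\alpha - g_\beta) \in H_\alpha$. It follows that $g_\alpha - g_\beta \in H_\alpha$.
Therefore, $z \in$ invlim (G; $\mathfrak{H}$).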


Conversely, let invlim (G; $\mathfrak{H}$) have division in the vector group. Suppose
that $g \in G$, $ng \in H_\alpha$, where n is a non-zero integer. Construct z in the vector


* For such questions, treated from a different point of view, see “Preservation of
divisibility in quotient groups,” Duke Mathematical Journal, vol. 15 (1948), pp. 347-356.
Cf. A. Kurosh, “Primitive torsionsfreie abelsche Gruppen vom endlichen Range,”
Annals of Mathematics (2), vol. 38 (1937), pp. 175-203.




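group such that the component of z in $G/H_\beta$ is $0 + H_\beta$ if $H_\beta \not\supset H_\alpha$ and is
$g + H_\beta$ if $H_\beta \supset H_\alpha$. nz is obviously the zero element of the vector group
so that nz is in the inverse limit group. Therefore, $z \in$ invlim (G; $\mathfrak{H}$).

Case I. There exists $H_\gamma \ne H_\alpha$, $H_\alpha \supset H_\gamma$. Then $0 + H_\gamma$ maps back
onto $g + H_\alpha$. This implies $g \in H_\alpha$. Case II. $H_\alpha = \bigcap_{\beta \ne \alpha} H_\beta$. Then
$ng \in H_\beta$, and by the method of proof for case I, $g \in H_\beta$ for every such $\beta \in A$.
Hence, $g \in H_\alpha$.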


COROLLARY. Let invlim (G; $\mathfrak{H}$) be with division in $\sum \oplus' G/H_\alpha$. Then
every non-last (under $\supset$) $H_\alpha$ is with division in G.


The following example shows that Theorem V does not continue to hold
if the restrictions on $\mathfrak{H}$ are dropped. Let $\mathfrak{H}$ have two members, I and 2I,
where I is the group of integers, and G = I. I has division in I, 2I does not,
and $I \supset 2I$. The group invlim (I; $\mathfrak{H}$) is associated with the vector group
$(0) \oplus I_2$ where $I_2 = I/2I$. Let n be a non-zero integer, $x \in (0)$, $y \in I_2$, and suppose
that n(x,y) is an element of invlim (I; $\mathfrak{H}$), where (x,y) is an element of
the vector group. Let $1_2$ be the only non-zero element of $I_2$. But
both (0,0) and $(0,1_2)$, the only possibilities for (x,y), are in invlim (I; $\mathfrak{H}$)
so that the latter has division in the vector group even though 2I does not
have division in I.

THEOREM VI. Let [invlim (G; $\mathfrak{H}$)]° be the division hull of invlim (G; $\mathfrak{H}$)
in the associated vector group. Let $H_\alpha$° be the division hull of $H_\alpha$ in G.
Then there exists a homomorphism of [invlim (G; $\mathfrak{H}$)]° into invlim (G; {$H_\alpha$°}),
the kernel of which consists of all those $\{g_\alpha + H_\alpha\}$ of [invlim (G; $\mathfrak{H}$)]° for
which $g_\alpha \in H_\alpha$° for every $\alpha \in A$.


Proof. If $\{g_\alpha + H_\alpha\} \in$ [invlim (G; $\mathfrak{H}$)]°, then there exists a positive
integer m such that $m(g_\beta - g_\gamma) \in H_\beta$ whenever $\beta < \gamma$. Hence, $g_\beta - g_\gamma \in H_\beta$°,
and $\{g_\alpha + H_\alpha$°$\}$ is in invlim (G; {$H_\alpha$°}). The remaining details of the proof
are trivial.

We note that the elements of [invlim (G; {nG})]° are “almost” in
the form of inverse limit elements. For, if $m\{g_n + nG\} \in$ invlim (G; {nG}),
where $g_n \in G$, then $m(g_s - g_t) \in sG$ whenever $sG \supset tG$; and if (m,s) = 1,
$g_s - g_t \in sG$.

Let H, a subgroup of G, have the property that G/H is a torsion group.’ 
Then, to each $g \in G$, there exists a positive integer n(g) such that $n(g) \cdot g \in H$,
and conversely. This condition, also equivalent to the condition H° = G,
might be described by saying that H is a torsion generating subgroup of G. 


THEOREM VII. Let $\mathfrak{H}$ have no last element under $\supset$; or, let the last
element of $\mathfrak{H}$ be the intersection of all the other members of $\mathfrak{H}$. Then if
invlim (G; $\mathfrak{H}$) be torsion generating in the corresponding vector group, each
$H_\alpha$ is torsion generating in G.


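Proof. For $g \in G$ and for $\alpha \in A$, form an element z of the vector group
in such a way that the component of z in $G/H_\alpha$ is $g + H_\alpha$ and in $G/H_\beta$ is
$0 + H_\beta$, for all $\beta \ne \alpha$. By hypothesis, there exists a positive integer n
such that $nz \in$ invlim (G; $\mathfrak{H}$).


Case I. There exists $\beta \in A$ such that $H_\beta \ne H_\alpha$, $H_\alpha \supset H_\beta$. Then
$0 + H_\beta$ maps back onto $ng + H_\alpha$, so that $ng \in H_\alpha$. Case II. $H_\alpha = \bigcap_{\beta \ne \alpha} H_\beta$.
$ng + H_\alpha$ maps back onto $0 + H_\beta$ for all $\beta \ne \alpha$. It follows that
$ng \in \bigcap_{\beta \ne \alpha} H_\beta = H_\alpha$.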


COROLLARY. If invlim (G; $\mathfrak{H}$) is torsion generating in $\sum \oplus' G/H_\alpha$,
then each non-last $H_\alpha$ is torsion generating in G.


The converse of this theorem is not true, in general, even if we restrict
$\mathfrak{H}$ as in the theorem. For example, in the group $\sum \oplus' I/nI$ choose an
element z with n-th component 0 + nI for non-composite positive integers n
and with n-th component n/C + nI for composite positive integers n, where
C = C(n) is the product of all the first powers of the distinct positive prime
divisors of n. If $z \in$ invlim (I; {nI}), 1 + 6I maps back onto 0 + 2I, so
that $1 \in 2I$, a contradiction. Thus, the inverse limit group is proper in the
vector group. Consider all positive integral multiples mz of z, where m > 1.
To each such m, there exists a positive, composite, odd integer n such that
(m,n) = 1. Let p be the first prime smaller than any prime factor of n.
Such a prime exists since n is odd. It follows that (p − 1, n) = 1 and that
(p,n) = 1. Consider the components n/C(n) + nI and p²n/C(p²n) + p²nI
of z. Suppose that for some m as defined above $mz \in$ invlim (I; {sI}). Then
mn/C(n) − mp²n/C(p²n) should be in nI. But C(p²n) = pC(n), so that
the above expression reduces to mn(1 − p)/C which is not divisible by n.
This proves that the inverse limit group is not torsion generating in the
vector group, even though each sI is torsion generating in I. Note that the
system of subgroups $\mathfrak{H}$ = {sI} has no last element under $\supset$.
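The obstruction used in this example lends itself to a direct check. The sketch below is illustrative only: the function names are hypothetical, and the particular choice n = q² (q the least odd prime not dividing m) is one convenient way, assumed here, of producing an odd composite n prime to m. It verifies that the components of mz at n and at p²n are incompatible for a range of m.

```python
from math import gcd


def radical(n):
    """C(n): the product of the distinct primes dividing n."""
    c, p = 1, 2
    while n > 1:
        if n % p == 0:
            c *= p
            while n % p == 0:
                n //= p
        p += 1
    return c


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def component(n):
    """Representative of the n-th component of z: 0 if n is not composite, else n / C(n)."""
    if n < 4 or is_prime(n):
        return 0
    return n // radical(n)


def text_witness(m):
    """Return (n, p): an odd composite n prime to m and a prime p below every prime factor of n."""
    q = 3
    while not (is_prime(q) and gcd(q, m) == 1):
        q += 2
    n = q * q
    p = max(r for r in range(2, q) if is_prime(r))
    return n, p


def violates(m, n, p):
    """True if m*z fails the compatibility condition between the components at n and at p*p*n."""
    difference = m * (component(n) - component(p * p * n))
    return difference % n != 0
```

With these definitions `violates(m, *text_witness(m))` tests, for a single m, exactly the divisibility that the argument above shows to fail.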

An example to show that the conclusion of Theorem VII need not hold
if $\mathfrak{H}$ has a last element is given by the group $I \oplus I$, the direct sum of two
copies of the group I of integers. Let $\mathfrak{H}$ consist of two subgroups, $I \oplus I$
and $(0) \oplus I$. It is easy to show that the vector sum coincides with the
inverse limit group so that the latter is torsion generating in the former.
But $(0) \oplus I$ is not torsion generating in $I \oplus I$.


WASHINGTON UNIVERSITY, 
St. Louis, Mo. 




ALMOST PERIODIC INVARIANT VECTOR SETS IN A METRIC 
VECTOR SPACE.* 


By HERMANN WEYL. 


1. The problem: basic notions and axioms. The situation underlying
the theory of spherical harmonics deals with (complex-valued) functions f(P)
on the sphere $\Pi$ under the influence of its rotations s. The rotations form a
transitive group $\sigma$. The spherical harmonics of order l form a linear set of
such functions (of dimensionality 2l + 1) that is invariant under all rotations.
Here evidently the fact is used that functions f(P) can be added and multiplied
by numbers, in other words, that they form a vector space (of infinitely many
dimensions). If the rotation s carries the point P into sP the transform
f′ = sf of the function f is defined by f′(sP) = f(P), or $sf(P) = f(s^{-1}P)$.
It is obvious how to generalize this situation:

I. Given a group $\sigma$ and a vector space $\Sigma$. Any two vectors can be
added and a vector can be multiplied by a number; for these operations
the well-known axioms of vector geometry hold. Moreover there is
associated with each element s of $\sigma$ a linear transformation $f \to f' = sf$
of the vector space,

$$s(f_1 + f_2) = (sf_1) + (sf_2), \qquad s(\alpha f) = \alpha \cdot sf,$$

so that

$$1f = f, \qquad t(sf) = (ts)f$$

($\alpha$ being any number, f, $f_1$, $f_2$ any vectors, s, t any group elements and 1
denoting the unit element of $\sigma$).
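Definition. h linearly independent vectors $g_1, \ldots, g_h$, or their linear
combinations $g = \xi_1 g_1 + \cdots + \xi_h g_h$ by arbitrary numerical coefficients $\xi_\mu$,
form an invariant set T if $g \in T$ implies $sg \in T$ for every s in the group:
$g = \xi_1 g_1 + \cdots + \xi_h g_h$ is changed by s into $sg = \xi_1^s g_1 + \cdots + \xi_h^s g_h$ where

$$\xi_\mu^s = \sum_\nu \omega_{\mu\nu}(s)\, \xi_\nu.$$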


* Received May 11, 1948. 




$s \to \Omega(s) = \|\omega_{\mu\nu}(s)\|$ is a matric representation $\Omega$ of the group $\sigma$ of degree h.
Change of the basis of T changes $\Omega$ into an equivalent representation. We
speak of an invariant set of degree h and order $\Omega$. The set is called primitive
if $\Omega$ is an irreducible representation.

The salient point in the theory of spherical harmonics is the fact that 
they form a complete orthogonal system. Clearly this fact depends on the 
possibility to form a scalar product (g,f) of two functions f, g (namely by
integrating $\bar g(P) \cdot f(P)$ over the sphere by means of an area element $d\omega_P$
that is invariant under rotations). Hence we add to I the following
assumptions: 


II. $\Sigma$ is a metric vector space; i.e. with any two vectors f, g there
is associated a number (g,f) as their scalar product, that is linear in the
factor f, co-linear in the factor g:

$$(g, f_1 + f_2) = (g, f_1) + (g, f_2), \qquad (g_1 + g_2, f) = (g_1, f) + (g_2, f),$$
$$(g, \alpha f) = \alpha \cdot (g, f), \qquad (\alpha g, f) = \bar\alpha \cdot (g, f).$$

It is of Hermitian symmetry, $(f, g) = \overline{(g, f)}$, and $\|f\|^2 = (f, f)$ is
positive except for f = 0. The linear transformations (s) in $\Sigma$, $f \to sf$,
associated with the elements s of $\sigma$, are isometric, (sg, sf) = (g, f).


The equation $\|\alpha f\| = |\alpha| \cdot \|f\|$ and the inequalities


are an immediate consequence of these assumptions characteristic for a 
“unitary metric.” More generally, for any numbers $\xi_i$ and vectors $f_i$, $g_i$
($i = 1, \ldots, m$) one has the relations
(1.1) 


which are proved in the same manner as the analogous Cauchy-Schwarz 
inequality for numbers. 

We can now choose the basis $g_\mu$ ($\mu = 1, \ldots, h$) of a given invariant
set T as a unitary one, $(g_\mu, g_\nu) = \delta_{\mu\nu}$. Then $\Omega(s)$ is a unitary matrix and
$\Omega$ a unitary representation. Every vector f may be split into a vector e
lying in T and one f′ that is perpendicular to T,

$$f = e + f', \qquad e = \sum_\mu \alpha_\mu g_\mu,$$

by choosing as the $\alpha_\mu$ the Fourier coefficients of f, $\alpha_\mu = (g_\mu, f)$. Then




In the Parseval equation, the proof of which is our goal, only invariant 
sets of such functions will occur as can be prepared by linear combination 
from the transforms sf of f. But, of course, finite combinations alone will 
not do; we need some property of closure by which to pass from finite sums 
to their limits (integrals or mean values). But how shall we measure the 
degree to which a vector difference g approaches zero? If || g || is used as 
this measure, then the continuous functions on the sphere will not form a 
closed vector space, one would have to include all Lebesgue-square-integrable 
functions. But it seems foolish to operate in such a wide field when all 
functions arising in the course of our construction are continuous. There- 
fore we introduce axiomatically the length | g| of an arbitrary vector g as a 
quantity that does not necessarily coincide with the modulus || g || and define 


closure in terms of this new notion. 


III. Length and Closure. Let there be associated with every 
vector f a non-negative number |f |, the length of f, which is zero for 
f =0 and satisfies the conditions 


(1. 3) If 
The transform sf is supposed to have the same length as f, | sf| =| f |. 
Any sequence $f_1, f_2, \ldots$ of vectors is said to converge strongly if
$|f_n - f_m| \to 0$ for $n, m \to \infty$. We require that any strongly convergent
sequence $f_n$ converges strongly toward a definite vector f, $|f_n - f| \to 0$
for $n \to \infty$.


It ought to be observed that, whether the space $\Sigma$ is closed in this sense
or not, it can always be extended into a closed space without violating any
of the previous axioms. This is done by Cantor’s classical procedure as
follows: Any strongly convergent sequence $f_n$ is said to define an ideal
vector f*; and two such sequences $f_n$, $g_n$ are said to define the same f* if
$|f_n - g_n| \to 0$. Addition, multiplication by numbers, the definitions of
(g,f), |f| and of the transformation $f \to sf$ readily extend to these ideal
vectors. In this way the axiom of closure can be enforced if it is not assumed.

We call a sequence $f_n$ weakly convergent if $\|f_n - f_m\| \to 0$ for $n, m \to \infty$.
The axioms about length are certainly fulfilled if we define |f| as $\|f\|$.
Strong-convergence-closure then becomes weak-convergence-closure.

Instead of $\Sigma$ we may use as our field of operation any closed linear
subspace $\Sigma_f$ of $\Sigma$ that is invariant with respect to all the transformations
(s), $g \to sg$, and contains the given vector f. The “smallest” of these
subspaces, $\Sigma_f^0$, consists of the strong limits of strongly convergent sequences,
the members of which are finite linear combinations $\sum_i \xi_i \cdot s_i f$ of transforms
sf of f. Our aim is to construct in $\Sigma_f^0$ a (finite or infinite) unitary sequence
of vectors

$$g_n \ (n = 1, 2, \ldots), \qquad (g_k, g_l) = \delta_{kl},$$

consisting of a string of invariant sets of such completeness that the Parseval equation

$$\|f\|^2 = \sum_n |\alpha_n|^2$$

holds for the Fourier coefficients $\alpha_n = (g_n, f)$ of f. This means that the
partial sums of the Fourier series $\sum \alpha_n g_n$ converge weakly to f; for

$$\Big\| f - \sum_{n=1}^{k} \alpha_n g_n \Big\|^2 = \|f\|^2 - \sum_{n=1}^{k} |\alpha_n|^2.$$


A further axiom (Axiom IV in 4) will be needed in order to ensure that f 
can also strongly be approximated with arbitrary accuracy by finite sums 


k 


No restriction shall be imposed on the group $\sigma$. However the proof of
the fundamental theorem depends on an essential assumption concerning the
vector f, an assumption to which for historical reasons the name “almost
periodic” (abbreviated: a.p.) has been given. Using |sf − tf| as a sort
of distance between two elements s, t of the group, let us say that s lies in
the f-circle of radius $\epsilon$ around a if $|sf - af| \le \epsilon$. Any two elements s, t
in this circle satisfy the relation $|sf - tf| \le 2\epsilon$. The vector f is called
almost periodic if the group is f-compact, i.e. if it may be covered by a finite
number of f-circles of arbitrarily small radius $\epsilon$.
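A concrete instance may help to fix the idea of f-compactness. The sketch below rests on assumptions not made in the text: the group is taken to be the additive group of integers, acting on functions on the circle by rotation through multiples of a fixed angle `THETA`, the vector is f(P) = exp(iP), and the length is the least upper bound of |f(P)|. Then |sf − tf| is a chord of the unit circle, and grouping the integers according to the arc in which the phase s·THETA falls gives finitely many sets, any two members of which lie within ε of each other.

```python
import cmath
import math

THETA = 1.0  # illustrative rotation angle in radians (an assumption of this sketch)


def distance(s, t, theta=THETA):
    """|sf - tf| for f(P) = exp(iP), where s acts by sf(P) = f(P - s*theta)."""
    return abs(cmath.exp(-1j * s * theta) - cmath.exp(-1j * t * theta))


def tile_index(s, eps, theta=THETA):
    """Index of the arc of angular width 2*pi/m <= eps that contains the phase of s."""
    m = math.ceil(2 * math.pi / eps)
    phase = (s * theta) % (2 * math.pi)
    return min(int(phase / (2 * math.pi / m)), m - 1)
```

Since a chord is never longer than its arc, two integers with the same `tile_index` satisfy |sf − tf| ≤ ε, and there are at most ⌈2π/ε⌉ tiles however many integers are considered.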

This definition may be replaced by the following criterion: f is a.p.
if the group can be covered by a finite number of sets $\sigma_i$ (“roof-tiles”)
such that $|sf - tf| \le \epsilon$ for any two elements s, t in the same set $\sigma_i$. The
$\sigma_i$ are then said to form an $(f, \epsilon)$-tiling. Indeed a finite number of f-circles
of radius $\epsilon$ covering the whole group form an $(f, 2\epsilon)$-tiling. Vice versa,
if the $\sigma_i$ form an $(f, \epsilon)$-tiling, we choose a “representative” $a_i$ in each of
the non-empty $\sigma_i$. The f-circles of radius $\epsilon$ around these representatives
cover the group. Let f′, f″ be two a.p. vectors, $\sigma'_i$ ($i = 1, \ldots, m$) an
$(f', \epsilon)$-tiling and $\sigma''_k$ ($k = 1, \ldots, n$) an $(f'', \epsilon)$-tiling. Form the $n \cdot m$
intersections $\sigma_{ik} = \sigma'_i \cap \sigma''_k$. For any two elements s, t in the same $\sigma_{ik}$
the inequalities

$$|sf' - tf'| \le \epsilon, \qquad |sf'' - tf''| \le \epsilon$$

hold simultaneously. In this sense the pair f′, f″ is a.p. if each of the two
members is. This fact carries over to any finite number of a.p. vectors
$f_1, f_2, \ldots, f_l$. Hence not only the product of an a.p. vector by a number $\alpha$,
but also the sum of two a.p. vectors, and thus any linear combination $\sum_i \xi_i f_i$
of a.p. vectors is a.p. Moreover the strong limit of a strongly convergent
sequence of a.p. vectors is a.p. Finally the transform af of an a.p. vector f
by an element a of $\sigma$ is a.p. Indeed let $a_1, \ldots, a_n$ be the centers of
f-circles of radius $\epsilon$ covering the whole group, and set $a^*_i = a_i a^{-1}$. Then,
for any given s, at least one of the inequalities

$$|saf - a^*_1 af| \le \epsilon, \ \ldots, \ |saf - a^*_n af| \le \epsilon$$

holds. We thus see that with f all the vectors of the space $\Sigma_f^0$ are a.p.

A numerical function $\phi(s)$ is f-continuous if for every $\delta > 0$ there exists
a positive $\epsilon$ such that $|\phi(s) - \phi(t)| \le \delta$ whenever $|sf - tf| \le \epsilon$. For a
vector function g(s), whose values are vectors, the length |g(s) − g(t)|
has to replace the absolute value $|\phi(s) - \phi(t)|$ in this definition. The
chief tool of the theory is the existence of a uniquely determined mean
value $J = \int_s \phi(s)$ for every f-continuous function $\phi(s)$. A number of
elements $a_i$ and corresponding weights $\alpha_i$ ($i = 1, \ldots, n$) of
sum 1 determine a (weighted) average

$$A = \sum_i \alpha_i \cdot \phi(a_i).$$

Let us say that this average oscillates by less than $\epsilon$ if

$$\Big| \sum_i \alpha_i \cdot \phi(a a_i) - A \Big| \le \epsilon$$

for all elements a. It can be proved [Maak 6; see also the Appendix of this
paper] that there are such averages for any given positive $\epsilon$, and that an
average $A_\epsilon$ that oscillates by less than $\epsilon$ and an average $B_\delta$ that oscillates by
less than $\delta$ differ by not more than $\epsilon + \delta$,

(1.4) $|A_\epsilon - B_\delta| \le \epsilon + \delta$.

Hence there exists a definite number J, the mean $\int_s \phi(s)$ of $\phi$, such that
$|A_\epsilon - J| \le \epsilon$ for every average $A_\epsilon$ that oscillates by less than $\epsilon$ and every
$\epsilon > 0$. By its very definition the function $\phi(as)$ has the same mean J as
$\phi(s)$ for any fixed element a of $\sigma$. It can be shown that the same is true
for $\phi(sa)$. All this carries over to an f-continuous vector function g(s) of s.

The axioms I-III remain valid if $\sigma$ is replaced by any subgroup $\sigma^0$ of $\sigma$.
A vector f that is almost periodic with respect to $\sigma$ is also a.p. with respect
to $\sigma^0$. Indeed if a finite number of subsets $\sigma_i$ of $\sigma$ form an $(f, \epsilon)$-tiling
on $\sigma$ then the sets $\sigma^0 \cap \sigma_i$ do so on $\sigma^0$.

Next we describe in precise terms that special interpretation of our 
axioms, to the simplest case of which the opening paragraph alluded. 


I₁. Given a group $\sigma$ and a point field $\Pi$; moreover a realization
of $\sigma$ by transformations of $\Pi$. In other words, with every element s
of $\sigma$ there is associated a mapping {s}: $P \to P' = sP$ of $\Pi$ upon itself
such that 1P = P, t(sP) = (ts)P. The group of these transformations
is supposed to be transitive, i.e. given any two points P and Q
there exists an element a of $\sigma$ such that Q = aP. The transform sf of a
numerical function f = f(P) is defined by $sf(P) = f(s^{-1}P)$.


An element s is said to lie in the f-circle of radius $\epsilon$ around a if
$|sf(P) - af(P)| \le \epsilon$ identically in P. The function f(P) is called almost
periodic if the group can be covered by a finite number of f-circles of
arbitrarily small radius $\epsilon$. Choose a point $P_0$. This requirement obviously
implies that $sf(P_0) = f(s^{-1}P_0)$ is a bounded numerical function of s, and
hence, because of the transitivity of the group of transformations, f(P) is a
bounded function on $\Pi$. Looking upon the function f(P) as a vector f in
the functional space $\Sigma$ of all a.p. functions on $\Pi$, we define |f| as the
lowest upper bound (l.u.b.) of |f(P)|, with the effect that our definition
of almost-periodicity for the function f(P) now coincides with that for the
vector f as given above. Axioms I and III are satisfied in $\Sigma$ with the
(possible) exclusion of the relation (1.3).

It remains to define the scalar product. For any given point P the
numerical function $\psi(s) = sf(P)$ is evidently f-continuous, and we may
form the average $\int_s \psi(s) = \int_s f(s^{-1}P)$. Since $\int \psi(as) = \int \psi(s)$,
i.e. $\int_s f(s^{-1}a^{-1}P) = \int_s f(s^{-1}P)$, this mean has the same value for
$Q = a^{-1}P$ as for P and is thus a constant J. Write $J = \int_Q f(Q)$ and call J
the mean value of the a.p. function f(P). The relation $\int \psi(sa) = \int \psi(s)$,
on the other hand, shows that the transform af has the same mean as f.
(It is only at this one place where the relation $\int \psi(sa) = \int \psi(s)$ comes into
play.) The pair f(P), g(P) of two a.p. functions is a.p., and hence also
the product $\bar g(P) \cdot f(P)$ of the conjugate of g by f. We now define the
scalar product (g,f) as the integral $\int_P \bar g(P) \cdot f(P)$ and then find that the
axioms II and the relation (1.3) are true in $\Sigma$. (In addition all the vectors
of this space are almost periodic. But this is no serious restriction. For
all our axioms carry over from a space $\Sigma$ to the subspace of its almost
periodic vectors.) We refer to this special interpretation as the “interpretation
by scalar functions on $\Pi$.”

It can be further specialized by identifying $\Pi$ with $\sigma$ and associating
with the element a of $\sigma$ the left translation {a}: $s \to as$ of $\Pi = \sigma$. The
left translations constitute a transitive group of transformations on $\Pi = \sigma$
which is isomorphic to $\sigma$. This “interpretation by scalar functions on $\sigma$” is
used in the construction of a complete set of inequivalent irreducible unitary
representations of $\sigma$.

For compact Lie groups the theory of continuous representations was 
developed by F. Peter and the author [8] in 1926, and it was at once realized 
[10] that the method, I shall call it the integral equation method, carries 
over to the group of translations of a straight line and thus affords a natural 
approach to H. Bohr’s theory of almost periodic functions [2]. It was 
J. von Neumann [7] who discovered that mean values can be defined for 
almost periodic functions on any group, and thus the construction of a 
complete set of a. p. representations was extended to arbitrary groups. In 
defining the mean value and deriving its essential properties we follow here a 
simplified procedure due to W. Maak [6]. The interpretation by scalar 
functions on II (for compact Lie groups) was given by E. Cartan [8] and 
the author [11]. Results concerning unitary group representations in Hilbert 
space were obtained by S. Bochner and J. von Neumann [1] and by Anna 
Hurewitsch [5]. But the straightforward application of the integral equation 
method to the general situation staked out by our axioms seems to yield
more precise and complete information. No new methodological ideas will
be developed in this paper; its purpose is to circumscribe the conditions 
under which a known method works. 


§2 gives the construction of the complete string of invariant sets in $\Sigma_f^0$
and thus derives the Parseval equation (weak approximation of f). The
result is more fully evaluated in §3, while §4 proves the central theorem
concerning strong approximations, and adds a further interpretation in terms
of “vector functions on $\Pi$.” The Appendix (§5) contains some remarks
about Maak’s procedure and the combinatorial lemma on which it is based.




2. The construction. From now on everything is relative to a given
a.p. vector f so normalized that $\|f\|^2 \le 1$.
The Cauchy-Schwarz inequality will be used by us in three different
forms:

(C₁) $\big| \int_s \bar\eta(s)\,\xi(s) \big|^2 \le \int_s |\eta(s)|^2 \cdot \int_s |\xi(s)|^2$

(C₂) $\big\| \int_s \xi(s) \cdot g(s) \big\|^2 \le \int_s |\xi(s)|^2 \cdot \int_s \|g(s)\|^2$

(C₃) $\big| \int_s (g'(s), g(s)) \big|^2 \le \int_s \|g'(s)\|^2 \cdot \int_s \|g(s)\|^2$

$\xi(s)$, $\eta(s)$ are f-continuous numerical functions, g(s), g′(s) are f-continuous
vector functions. (C₁) arises from Cauchy’s inequality for numbers by
passing from finite sums to the limit of integrals, (C₂) and (C₃) stand in
the same relation to the inequalities (1.1), (1.2).

The given vector f defines the following linear mapping $\mathfrak{f}$ of the space $\Xi$
of f-continuous functions $\xi(s)$ upon the vector space $\Sigma$,

$$\mathfrak{f}: \ \xi(s) \to g = \int_s \xi(s) \cdot sf.$$

The image $g = \mathfrak{f}\xi$ lies in $\Sigma_f^0$. On the other hand, consider the linear mapping
$\mathfrak{f}^*$ of the vector space $\Sigma$ upon the space $\Xi$ defined by

$$g \to \xi(s) = (sf, g).$$

[$\xi(s)$ is not only f-continuous, but even satisfies a strong Lipschitz condition

(2.1) $|\xi(s) - \xi(t)| \le \text{const.} \cdot \|sf - tf\|$.]

If $\int_s \bar\eta(s) \cdot \xi(s)$ is taken as the scalar product $(\eta, \xi)$ in $\Xi$, the operator $\mathfrak{f}^*$
is the Hermitian conjugate of $\mathfrak{f}$. Indeed $\eta = \mathfrak{f}^* g'$ gives $\bar\eta(s) = (g', sf)$ and

$$(\eta, \xi) = \int_s (g', sf)\, \xi(s) = (g', g)$$

where $g = \int_s \xi(s) \cdot sf$, hence $(\mathfrak{f}^* g', \xi) = (g', \mathfrak{f}\xi)$. Form the operator $\mathfrak{f}^*\mathfrak{f}$ in $\Xi$,

$$\xi(s) \to \int_t (sf, tf)\, \xi(t) = \int_t H(s,t)\, \xi(t),$$

with the Hermitian f-continuous kernel H(s,t) = (sf, tf). This kernel is
positive definite in the sense that

$$\int_s \int_t \bar\xi(s)\, H(s,t)\, \xi(t) \ge 0$$

for every function $\xi(s)$ in $\Xi$. Indeed the left side has the value
$\| \int_s \xi(s) \cdot sf \|^2$. Because of its invariance (under left translations),

(2.2) H(as, at) = H(s, t),

the kernel H(s,t) actually depends on the one argument $t^{-1}s$ only. Note that
$|H(s,t)| \le 1$.
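The properties of the kernel just listed can be observed in the simplest non-trivial case. The sketch below makes assumptions that are not in the text: the group is the cyclic group of order N acting on itself by translation, the vector f is an arbitrary list of N complex numbers, and the scalar product is the mean value of $\bar g(P) f(P)$ over the N points. The names `inner`, `shift`, `H`, `quadratic_form` and `image_norm_squared` are hypothetical.

```python
import cmath

N = 8                                        # order of the illustrative cyclic group
_raw = [1, 2j, 0, -1, 0.5, 0, 1 - 1j, -2]    # arbitrary sample values f(P), P = 0..N-1
_scale = (sum(abs(x) ** 2 for x in _raw) / N) ** 0.5
f = [x / _scale for x in _raw]               # normalised so that (f, f) = 1


def inner(g, k):
    """Scalar product (g, k): mean of conj(g(P)) * k(P); linear in k, co-linear in g."""
    return sum(complex(a).conjugate() * b for a, b in zip(g, k)) / N


def shift(s, g):
    """Transform sg, defined by sg(P) = g(s^{-1}P) = g(P - s)."""
    return [g[(P - s) % N] for P in range(N)]


def H(s, t):
    """The kernel H(s, t) = (sf, tf)."""
    return inner(shift(s, f), shift(t, f))


def quadratic_form(xi):
    """Double mean of conj(xi(s)) * H(s, t) * xi(t)."""
    return sum(
        xi[s].conjugate() * H(s, t) * xi[t] for s in range(N) for t in range(N)
    ) / N ** 2


def image_norm_squared(xi):
    """Squared modulus of the vector mean_s xi(s) * sf."""
    g = [sum(xi[s] * shift(s, f)[P] for s in range(N)) / N for P in range(N)]
    return inner(g, g).real
```

In this setting `quadratic_form(xi)` and `image_norm_squared(xi)` agree for any choice of `xi`, which is the identity behind the positive definiteness of H.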


The integral equation method consists in determining the (reciprocal)
positive eigenvalues $\gamma, \gamma', \ldots$ of this kernel by E. Schmidt’s procedure and
thereby proving the equation

$$\mathrm{tr}(H) = \int_s H(s,s) = h\gamma + h'\gamma' + \cdots.$$

The iterations of the kernel H are formed according to

$$H_1(s,t) = H(s,t), \qquad H_{n+1}(s,t) = \int_r H_n(s,r)\, H(r,t) \quad (n = 1, 2, \ldots).$$

Set $T_n = \mathrm{tr}(H_n)$. The largest positive eigenvalue $\gamma$ is constructed as the
limit of the quotient $T_{n+1}/T_n$ for $n \to \infty$, and the corresponding eigenkernel

(2.3) $E(s,t) = \sum_{\mu=1}^{h} \phi_\mu(s)\, \bar\phi_\mu(t)$

as the limit of $H_n(s,t)/\gamma^n$ for $n \to \infty$. $\phi_1(s), \ldots, \phi_h(s)$ form a unitary
basis for the eigenfunctions of H(s,t) that belong to the eigenvalue $\gamma$.
In this construction the invariance property (2.2) plays no rôle. However
the latter is decisive for the fact that the h functions $\phi_\mu(s)$ and the
corresponding vectors $g_\mu$ constitute invariant sets.
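The limiting process can be followed numerically on a small example. The sketch below is an illustration under stated assumptions, not the construction of the text in its generality: the group is cyclic of order N, f is a real trigonometric polynomial chosen so that its two largest Fourier coefficients have equal modulus, and the mean value replaces the sum in composing kernels. The names `compose`, `trace` and `traces` are hypothetical. One then expects the quotients $T_{n+1}/T_n$ to increase toward the largest eigenvalue and $T_n/\gamma^n$ to approach its multiplicity, here 2.

```python
import math

N = 8
_raw = [math.cos(2 * math.pi * P / N) + 0.3 * math.cos(4 * math.pi * P / N) for P in range(N)]
_scale = math.sqrt(sum(x * x for x in _raw) / N)
f = [x / _scale for x in _raw]          # normalised so that the mean of f(P)^2 is 1


def H(s, t):
    """H(s, t) = (sf, tf) with sf(P) = f(P - s); real because f is real."""
    return sum(f[(P - s) % N] * f[(P - t) % N] for P in range(N)) / N


H1 = [[H(s, t) for t in range(N)] for s in range(N)]


def compose(A, B):
    """Kernel composition with the mean value: (A B)(s, t) = mean over r of A(s, r) B(r, t)."""
    return [[sum(A[s][r] * B[r][t] for r in range(N)) / N for t in range(N)] for s in range(N)]


def trace(A):
    """tr(A) = mean over s of A(s, s)."""
    return sum(A[s][s] for s in range(N)) / N


def traces(count):
    """Return [T_1, ..., T_count] with T_n = tr(H_n)."""
    out, Hn = [], H1
    for _ in range(count):
        out.append(trace(Hn))
        Hn = compose(Hn, H1)
    return out


T = traces(20)
q = [T[n + 1] / T[n] for n in range(len(T) - 1)]   # q_n = T_{n+1} / T_n
gamma = q[-1]                                      # approximation to the largest eigenvalue
h = T[-1] / gamma ** len(T)                        # approximation to its multiplicity
```

For this f the Fourier coefficients are known in closed form, so the computed `gamma` can be compared with 0.25/0.545, the squared modulus of the dominant coefficient after normalisation.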

In carrying out this program one has first to establish the inequalities

(2.4) $T_{m+n} \le T_m \cdot T_n$, (2.5) $T_n^2 \le T_{n-l} \cdot T_{n+l}$

($m, n \ge 1$, $0 \le l < n$), and this is the only point where the original
exposition requires a slight new touch. Set

$$f_0(s) = sf, \qquad f_n(s) = \int_t H_n(s,t) \cdot tf \quad (n = 1, 2, \ldots).$$

Then

(2.6) $T_{2n} = \int_s \int_t H_n(s,t)\, H_n(t,s) = \int_s \int_t |H_n(s,t)|^2 \ge 0$ (n = 1, 2, ...)

and

(2.7) $T_{2n+1} = \int_s \|f_n(s)\|^2 \ge 0$ (n = 0, 1, ...).

The second relation follows from the general equation

(2.8) $(f_m(s), f_n(t)) = H_{m+n+1}(s,t)$ ($n, m \ge 0$).

Hence all $T_n \ge 0$ (n = 1, 2, ...). In proving (2.4) we distinguish the
three cases (i) m, n even; (ii) m even, n odd; (iii) m, n odd.


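(i). $H_{m+n}(s,t) = \int_r H_m(s,r)\, H_n(r,t)$ yields by means of the inequality (C₁):

$$|H_{m+n}(s,t)|^2 \le \int_r |H_m(s,r)|^2 \cdot \int_r |H_n(r,t)|^2$$

and hence by integrating over (s,t) and using the definition (2.6):

$$T_{2m+2n} \le T_{2m} \cdot T_{2n}.$$

(ii) $f_{m+n}(s) = \int_t H_m(s,t)\, f_n(t)$ gives by means of (C₂):

$$\|f_{m+n}(s)\|^2 \le \int_t |H_m(s,t)|^2 \cdot \int_t \|f_n(t)\|^2$$

and thus in view of (2.6), (2.7), after integration over s,

$$T_{2m+2n+1} \le T_{2m} \cdot T_{2n+1}.$$

(iii) From (2.8) there follows

$$|H_{m+n+1}(s,t)|^2 \le \|f_m(s)\|^2 \cdot \|f_n(t)\|^2$$

and then by integration over s and t,

$$T_{2m+2n+2} \le T_{2m+1} \cdot T_{2n+1}.$$

In (2.5) we distinguish the two cases (i) n + l even; (ii) n + l odd.
Set in the first case $n - l = 2\mu$, $n + l = 2\nu$, in the second $n - l = 2\mu + 1$,
$n + l = 2\nu + 1$.

(i) $T_n = \int_s H_n(s,s) = \int_s \int_t H_\mu(s,t)\, H_\nu(t,s)$,

hence application of (C₁) to integrals over the pair (s,t) gives the desired
result

$$T_n^2 \le \int_s \int_t |H_\mu(s,t)|^2 \cdot \int_s \int_t |H_\nu(t,s)|^2 = T_{2\mu} \cdot T_{2\nu}.$$

(ii) Apply (C₃) to

$$T_n = \int_s H_n(s,s) = \int_s (f_\mu(s), f_\nu(s))$$

with the result

$$T_n^2 \le \int_s \|f_\mu(s)\|^2 \cdot \int_s \|f_\nu(s)\|^2 = T_{2\mu+1} \cdot T_{2\nu+1}.$$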


Only the case l = 1 of (2.5) will be used,

(2.9) $T_n^2 \le T_{n-1} \cdot T_{n+1}$ (n = 2, 3, ...).

If $f \ne 0$ then $T_1 = \|f\|^2 > 0$ and also $T_2 = \int_s \int_t |(sf, tf)|^2 > 0$ [for
otherwise (sf, tf) = 0 for all s and t, in particular (f, f) = 0]. The
inequality (2.9) then shows that one after the other of the traces $T_3, T_4, \ldots$
are likewise positive and not zero. More precisely we find that $q_n = T_{n+1}/T_n$
(n = 1, 2, ...) is an increasing sequence, while (2.4) shows that

(2.10) $q_n^m \le T_{n+m}/T_n \le T_m$

or that the sequence $q_n$ never grows beyond $T_m^{1/m}$, in particular not beyond
$T_1 \le 1$. Hence it converges to a positive number $\gamma$. Note in particular the
inequality $q_1 \le \gamma$ or

(2.11) $T_2 \le \gamma \cdot T_1$.

It is even easy to make an explicit estimate of the rapidity of convergence. 
Considering that gq, does not grow beyond and 
<= qn" one finds that gn from n =m on can not grow by more than 


qm (1 — S 1 — S —1)/m S (qr*—1)/m. 


$T_n/\gamma^n$ decreases with increasing n since $q_n = T_{n+1}/T_n \le \gamma$, hence tends to a
limit h. But as (2.10) in the limit for $n \to \infty$ gives $\gamma^m \le T_m$, the limit h
of $T_m/\gamma^m$ for $m \to \infty$ is by necessity greater than or equal to 1. (As h will
turn out to be the multiplicity of the eigenvalue $\gamma$, an explicit estimate can
not be expected for this second step!) Application of (C₁) to


Hms2(8; t) t) ff H(s, r) 5 r’) Al’, ”) H(r’, t) 


yields 


t) t) 1 Tom Ty, 


m+2 n+2 
Y 


Y 




and thus the uniform convergence of $H_n(s,t)/\gamma^n$ to an f-continuous limit
E(s,t). Invariance, (2.2), carries over from H to E, E(as, at) = E(s,t).
From the equations

$$\int_r H(s,r)\, E(r,t) = \int_r E(s,r)\, H(r,t) = \gamma\, E(s,t);$$

$$E(s,t) = \int_r E(s,r)\, E(r,t), \qquad E(s,t) = \overline{E(t,s)}; \qquad \mathrm{tr}(E) = h$$

there follows by a well-known elementary argument such a relation as (2.3)
in which the $\phi_\mu(s)$ form a unitary-orthogonal system of f-continuous
eigenfunctions of H,

(2.12) $\int_t H(s,t)\, \phi_\mu(t) = \gamma \cdot \phi_\mu(s)$,

h turns out to be a positive integer. The one equation (2.12) may be split
into two,

(2.13) $\sqrt{\gamma} \cdot g_\mu = \int_s \phi_\mu(s) \cdot sf, \qquad \sqrt{\gamma} \cdot \phi_\mu(s) = (sf, g_\mu)$,

by using the first as definition of the vector $g_\mu$. The $g_\mu$ are unitary-orthogonal,
$(g_\mu, g_\nu) = \delta_{\mu\nu}$.
$\mathfrak{f}\mathfrak{f}^* = \mathfrak{F}$ is a linear operator which carries the arbitrary vector g into

$$\mathfrak{F} g = \int_s (sf, g) \cdot sf.$$

This operator is invariant in the sense that

(2.14) $\mathfrak{F}(ag) = a(\mathfrak{F} g)$ for any group element a.

Indeed

$$a(\mathfrak{F} g) = \int_s (sf, g) \cdot asf = \int_s (asf, ag) \cdot asf = \int_s (sf, ag) \cdot sf = \mathfrak{F}(ag).$$

The corresponding Hermitian form depending on two arbitrary vectors g, g′ is

$$(g', \mathfrak{F} g) = \int_s (g', sf)\,(sf, g).$$

$(g', \mathfrak{F} g)$ is conjugate to $(g, \mathfrak{F} g')$. By combining the two equations (2.13)
in inverse order one sees that the vectors $g_\mu$ obtained by our construction are
eigenvectors of the operator $\mathfrak{F}$ for the eigenvalue $\gamma$,

(2.15) $\mathfrak{F} g_\mu = \gamma \cdot g_\mu$.



By starting with the equation 
ou(s) = t) du(t) 

and utilizing the invariance property of Z one finds that 

u(a*s) f E (as, t) = f B(a%s, a4) 

t t 
is a linear combination wvy(a) dv(s) of the and thus 
8 

or ~ wvp(a) gv. This equation 
(2.16) $s g_\mu = \sum_\nu \omega_{\nu\mu}(s)\, g_\nu$

shows that the $g_\mu$ form an invariant set $\{g_\mu\}$ belonging to the (necessarily
unitary) representation $\Omega$: $s \to \Omega(s) = \|\omega_{\mu\nu}(s)\|$. Let us say that h vectors $g_\mu$
($\mu = 1, \ldots, h$) transform according to $\Omega$ and call $(g_1, \ldots, g_h)$ briefly an
$\Omega$-row, if the equations (2.16) hold for every group element s. For any
such row we infer from (2.16):
(2.17) (sf In) = (f, 87° = gv) 


(f, gv) is the conjugate of the Fourier coefficient a — (gv,f). Moreover 
the matrix || wpv(s*) || is reciprocal to || way(s)!| and the latter is unitary, 
hence = Gpv(s), whereby (2.17) turns into 


(2. 18) (sf, Ju) = Gyv(s) 
For two Q-rows g, g’ one gets ; 

2 (9'u Sf) (sf, = - 
and hence by integration over s: 
(2. 19) > S9In) = ~ 


The special set {g,} constructed above satisfies thé relation (2.15), thus 
(9u, &9n) =y, and (2.19) yields the important equation 


(2. 20) hy = > OnGp. 




Let us say that the invariant set {g,} occurs in f with the (non-vanishing) 
weight | a | ?. 
When we subtract 


h 


from f, the remainder f′ = f − e is orthogonal to the vectors $g_\mu$. If it is
still different from zero, we repeat for f′ the process carried out before on f.
One gets a new eigenvalue $\gamma'$ and again a corresponding set of eigenfunctions
$\phi'_\mu(s)$ ($\mu = 1, \ldots, h'$) and vectors $g'_\mu$. The $\phi'_\mu(s)$ turn out to be
orthogonal to the $\phi_\mu(s)$, the $g'_\mu$ to the $g_\mu$. The equation $T_n = h\gamma^n + T'_n$
shows that $T'_n/\gamma^n \to 0$, hence $\gamma' < \gamma$.

It is obvious how to continue the process. It yields a string of orthogonal
functions $\phi_n(s)$ and orthogonal vectors $g_n$ ($n = 1, 2, \cdots$) subdivided into
sections. The section $p$, $n_{p-1} < n \le n_p = n_{p-1} + h_p$, is characterized by a
positive eigenvalue $\gamma_p$ of multiplicity $h_p$; the vectors $g_n$ of this section form
an invariant set and are connected with their partners $\phi_n(s)$ by the relations


Gn= f on(s) bn(s) = (Sf, gn) << 
(p = 0, *5 hy =h, n_, = 0.) 


Form the Fourier coefficients %, == (gn, f) of f and the remainder 


nNp-1 


=f—(etat:: ++ =f 2 


Should it ever happen that one of the successive remainders f, f’,- °° 
vanishes, then the process comes to a stop. But since our final result is 
trivial in that case, we suppose in our argument that this never comes to pass. 


yp tends to zero, because 


ae |? + lan + SI 
implies 
hy hp) = = 1. 
The kernel 
(s,t) = (sf), tf) = (sf, tf ) 


depends on ¢-1s only, 
H)(s,t) =H® (ts), H®(s) = (sf, f). 


Application of the relation (2.11) to f@) instead of f gives the inequality 




=tr(H) =H (1) = | f and = | 


and thus 


(2. 21) /n. 


By means of the equi-continuity of H) (s) = (sf, f™) for all p we deduce 
from this estimate for an estimate for = || || ? = as follows. 
Choose a positive number e and ascertain a finite number N, of elements a; 
(i=1,---,N,) such that the f-circles C; of radius « around these elements 
a; cover the whole group. We propose to show that 8) 2e as soon as 
Np = N,/e?. Indeed the contrary hypothesis 2e < 8, will lead to np < N,/e. 
For an element s in the f-circle of radius « around | the following inequalities 
prevail : 
Isf—fl<|sf—fl<e 


hence 


and finally, since H‘?)(1) = 8,? and by hypothesis 8) > 2¢, 
| (s) | = $p(8p— €) > dy. 


Thus | |? > €°8,? for se Cj, and consequently 


Ne 
> | H) | ? > 
ia 


everywhere on o. By integration over s one finds 
N° sf | H)(s)| or (?) > 
8 
Combination with (2.21), n,T.) <,?, leads to the promised conclusion 
Ny < N,/e*?. But 
Hence the result can be stated as follows: 


Given any « > 0, the inequality 


(2. 22) (0S) If Sl a 


will hold as soon as n= N,/e’. 
This not only proves the Parseval equation

(2.23) $|\alpha_1|^2 + |\alpha_2|^2 + \cdots = \|f\|^2$

or the fact that the Fourier series $\alpha_1 g_1 + \alpha_2 g_2 + \cdots$ converges weakly to $f$,
but also provides us with an explicit estimate for the remainder under the
assumption that the invariant sets are arranged according to descending
weights, $\gamma > \gamma_1 > \cdots$.
From $f$ one easily passes to an arbitrary finite linear combination $\sum_i \lambda_i \cdot s_i f$
of transforms of $f$ and any vector $g$ that is the weak limit of a sequence of
such combinations. It results that we can approximate every such vector $g$
by a finite sum $v_n = \sum_{k=1}^{n-1} \beta'_k g_k$ in the weak sense, $\|g - v_n\| \le \epsilon$, with any
preassigned accuracy $\epsilon$. But the best approximation in the weak sense by a
sum $v_n$ of a given number $n-1$ of terms is obtained by choosing $\beta'_k$ as the
Fourier coefficients $\beta_k = (g_k, g)$ of $g$. Hence the Fourier series $\sum \beta_n g_n$ of
any g of the type just described, in particular of any vector g in %;°, 
converges weakly towards g. In other words, the Parseval equation extends 
from f to any vector g in 3;° without a change in the unitary sequence 


91, 92," constructed from f. 
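
The best-approximation property used above can be illustrated in a small finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the space is taken to be $\mathbb{C}^3$ with the usual scalar product (conjugate-linear in the first argument), and the orthonormal vectors `basis` and the vector `g` are hypothetical choices, not objects from the text. It compares the residual obtained with the Fourier coefficients $\beta_k = (g_k, g)$ against residuals obtained with perturbed coefficients.

```python
import math

def inner(u, v):
    """Scalar product (u, v), conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    """Modulus ||u|| = sqrt((u, u))."""
    return math.sqrt(inner(u, u).real)

def residual(g, coeffs, basis):
    """||g - sum_k coeffs[k] * basis[k]||."""
    approx = [sum(c * gk[i] for c, gk in zip(coeffs, basis)) for i in range(len(g))]
    return norm([a - b for a, b in zip(g, approx)])

# Hypothetical data: two orthonormal vectors in C^3 and a vector g to approximate.
s = 1 / math.sqrt(2)
basis = [[s, s * 1j, 0j], [s, -s * 1j, 0j]]
g = [1 + 2j, 3 - 1j, 0.5j]

fourier = [inner(gk, g) for gk in basis]   # beta_k = (g_k, g)
best = residual(g, fourier, basis)         # distance from g to the span of the basis
```

Any other choice of coefficients gives a residual at least as large as `best`, and the sum of the $|\beta_k|^2$ does not exceed $\|g\|^2$ (Bessel's inequality).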


3. Evaluation of the result. Of the arbitrariness involved in the choice
of a unitary basis $g_\mu$ ($\mu = 1, \cdots, h$) in a given invariant set we can make
use in such a way that the set breaks up into a number of mutually
orthogonal primitive sets. Let us do that with each of the sets obtained
by our construction! Suppose, for instance, that the first set consists of
$h = 8$ vectors and breaks up into two invariant sets of 5 and 3 members
respectively. Since the equation (2.15) holds for each vector $g_\mu$ of the total
set, the relation (2.20) will persist for the two partial sets,

$\sum_{\mu=1}^{5} |\alpha_\mu|^2 = 5\gamma, \qquad \sum_{\mu=6}^{8} |\alpha_\mu|^2 = 3\gamma.$

The primitive sets are therefore still arranged according to falling weights,
$\gamma \ge \gamma_1 \ge \cdots$, however the equality sign can not now be excluded. In the
case just mentioned $\gamma_1$ would equal $\gamma$ and $8\gamma$ appear as $5\gamma + 3\gamma_1$.
A stage has now been reached where passage from the constructive to 


the existential standpoint becomes feasible. Let

$\Omega\colon s \to \Omega(s) = \|\omega_{\mu\nu}(s)\|$

be a given irreducible unitary representation of the group $\sigma$ of degree $h$.
Any representation $\Omega'$ that is equivalent to $\Omega$, $\Omega' \sim \Omega$, is also unitary-equivalent
to it. [Indeed if the non-singular square matrix $A$ satisfies the
condition $\Omega(s)A = A\Omega'(s)$ one sees that $AA^*$ commutes with $\Omega(s)$, and
hence by Schur's lemma $AA^* = \rho E$, $\rho > 0$. Here $A^*$ is the conjugate of the
transpose of $A$, and $E$ denotes the unit matrix. $A/\sqrt{\rho}$ is unitary.] Thus
we may see to it that, whenever a primitive invariant set obtained in the
course of our construction belongs to a representation $\sim \Omega$, its orthogonal
basis $g_1, \cdots, g_h$ transforms according to $\Omega$ itself. Schur's lemma further
teaches the following two things: (1) If the $g_\mu$ and the $g'_\nu$ transform
according to two inequivalent irreducible unitary representations, then they
are mutually orthogonal,

$(g_\mu, g'_\nu) = 0 \qquad (\mu = 1, \cdots, h;\ \nu = 1, \cdots, h').$

(2) If they transform according to the same irreducible unitary $\Omega$, then

(3.1) $(g_\mu, g'_\nu) = \beta\cdot\delta_{\mu\nu}$

with a factor $\beta$ independent of $\mu$ and $\nu$. Any numerical multiple
$\lambda g = (\lambda g_1, \cdots, \lambda g_h)$ of an $\Omega$-row $g$ is an $\Omega$-row, and so is the sum of two
such rows. Thus the $\Omega$-rows form a linear manifold. It is natural to
introduce the factor $\beta$ in the equation (3.1) as the scalar product $(g, g')$
of the two $\Omega$-rows $g$ and $g'$;


Indeed it has all the formal properties of a scalar product, in particular 
(g,g) >0 unless g—0. Call the row a= (%,---,%) -of the conjugates 
% = (f, gn) of the Fourier coefficients the f-component of the Q-row g and 
define 

(a’, a) a | = (a, a). 


Aa and a+ a’ are the f-components of Ag and g+ 9’. The equation (2.19) 
now reads 


(3. 2) 38) (a’,a). 


An $\Omega$-row will be said to be hidden or flat if its $f$-component $a$ is zero,
and upright if it is perpendicular to all hidden $\Omega$-rows. Clearly there can
not be more than $h$ linearly independent upright $\Omega$-rows. For let
$g^{(1)}, \cdots, g^{(m)}$ be any $m$ such rows and form the linear combination
$g = \lambda_1 g^{(1)} + \cdots + \lambda_m g^{(m)}$. Whenever its $f$-component $\lambda_1 a^{(1)} + \cdots + \lambda_m a^{(m)}$
vanishes, $g$ itself must be zero, because such a $g$ is at the same time upright
and flat, therefore $(g, g) = 0$, $g = 0$.
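
The bracketed remark earlier in this section, that equivalence of two unitary representations already implies unitary equivalence, compresses a short computation. A sketch of the omitted steps, using only the unitarity of $\Omega$ and $\Omega'$ and the intertwining relation stated there, could read as follows.

```latex
% Given: \Omega(s) A = A \Omega'(s) for all s, with \Omega, \Omega' unitary.
\begin{align*}
  A^{*}\,\Omega(s)^{*} &= \Omega'(s)^{*}A^{*}
      && \text{(adjoint of the intertwining relation)}\\
  A^{*}\,\Omega(s^{-1}) &= \Omega'(s^{-1})\,A^{*}
      && \text{(unitarity: } \Omega(s)^{*}=\Omega(s^{-1})\text{)}\\
  A^{*}\,\Omega(s) &= \Omega'(s)\,A^{*}
      && \text{(replace } s \text{ by } s^{-1}\text{)}\\
  \Omega(s)\,AA^{*} &= A\,\Omega'(s)\,A^{*} = AA^{*}\,\Omega(s).
\end{align*}
% Schur's lemma gives AA^{*} = \rho E; \rho > 0 because AA^{*} is positive
% definite for non-singular A. Hence U = A/\sqrt{\rho} satisfies UU^{*} = E
% and \Omega(s) U = U \Omega'(s).
```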

(2.14) shows that the operator $\mathfrak{H}$ changes an $\Omega$-row $g$ into an $\Omega$-row $\mathfrak{H}g$;
according to (3.2) this $\mathfrak{H}g$ is necessarily upright. Any $\Omega$-row $g = \{g_\mu\}$
obtained by our construction satisfies an equation (2.15) with a positive
factor $\gamma$, $\mathfrak{H}g_\mu = \gamma\cdot g_\mu$, and consequently these $\Omega$-rows themselves are upright:
among all possible $\Omega$-rows the construction automatically selects the upright
ones as those that actually occur in the Parseval equation (2.23). Such a
principle of selection is needed, since the linear manifold of all $\Omega$-rows may
not be of finite nor even of denumerable dimensionality.

Let (m<h) be an orthogonal basis, g™) = 84, 
for the upright Q-rows. Then their f-components a™,---,a are also 
linearly independent. Moreover we have 


(gn, )= (1, h) 
and 


where the coefficients 

(8%, h(a", a) 
form a Hermitian matrix. (g,%g) for the arbitrary upright Q-row 
g=Déig' is the positive-definite Hermitian form 

i 
(3. 3) GLE] = Dd = | || ? 
isk i 
of the m variables é;. One can therefore ascertain m Q-rows g‘) such that 
=> Then 
k 


Bul (8) where Fu! (s) = (sf, Gal). 
$ 
[In passing it may be observed that if m has the highest possible value h, 
then the functions wpv(s) will be f-continuous; for then one can express 
them as linear combinations of the f-continuous functions (gy, sf) by 
inverting the relations (2.18), 


(gu, sf) =X opr(s) 
The orthogonal basis $g^{(i)}$ could be so chosen that $G[\xi]$ is on principal axes,
$\gamma_{ik} = 0$ for $i \ne k$, $\gamma_{ii} = \gamma_i > 0$.

Then $\mathfrak{H}g_\mu^{(i)} = \gamma_i \cdot g_\mu^{(i)}$. It is this normalized form which results from the
construction of Section 2. The greatest of the $m$ numbers $\gamma_1, \cdots, \gamma_m$ can be defined
as the maximum $\gamma(\Omega)$ of the Hermitian form (3.3) for $\sum_i |\xi_i|^2 \le 1$.


The contributions

(3.4) $e(\Omega) = \sum_{i,\mu} \alpha_\mu^{(i)} g_\mu^{(i)}$ and $\|e(\Omega)\|^2 = \sum_{i,\mu} |\alpha_\mu^{(i)}|^2$

of $\Omega$ to $f$ and $\|f\|^2$ are clearly independent of the choice of the orthogonal
basis $g^{(i)}$ for the upright $\Omega$-rows. Hence in the final statement we shall
ignore this normalization of $G[\xi]$. Nor do the contributions (3.4) change
if $\Omega$ is replaced by a unitary-equivalent representation, i.e. if each of the $m$
rows $g^{(1)}, \cdots, g^{(m)}$ undergoes the same unitary transformation $A$. The
sum $\sum \|e(\Omega)\|^2$ extending over any set of inequivalent irreducible unitary $\Omega$
can not exceed $\|f\|^2$ (Bessel's inequality). It is thus further clear that the
construction of Section 2 can not help yielding a complete basis $g^{(i)}$ ($i = 1, \cdots, m$)
of the upright $\Omega$-rows for each $\Omega$. Otherwise the sum (2.23) would fall
short of $\|f\|^2$ at least by the contributions of the missing $g^{(i)}$. We summarize
our findings in a preliminary Statement and a Theorem.


STATEMENT. Starting from a given a.p. vector $f$, our construction
accomplishes the following. For every irreducible unitary representation $\Omega$
of $\sigma$, degree $h$, it determines a complete orthogonal basis $g^{(1)}, \cdots, g^{(m)}$ of
those $\Omega$-rows $g = (g_1, \cdots, g_h)$ which are perpendicular to all hidden $\Omega$-rows,
and it picks out a complete set of inequivalent $\Omega$'s that actually occur, i.e.
for which $m > 0$. One has $m \le h$, and the vectors


p=i,---,h) 


lie in 3/°. [They are even of the special form f(s) - sf where $(s) is not 
only f-continuous but satisfies a strong Lipschitz condition with respect to f, 
(2.1).] The $g_\mu$ for two inequivalent irreducible unitary representations
$\Omega$ are mutually orthogonal. The contribution of $\Omega$ to $f$ is the sum
e(Q) = Say formed by means of the Fourier coefficients 
= f), the contribution of to || f equals || e(Q)| 7 | a |* 
The maximum of | |? for &|*S1 ts introduced as the 
i 


weight y(Q) with which Q occurs in f. 


THEOREM. The sum of the contributions $\sum \|e(\Omega)\|^2$ extending over the
denumerable sequence of inequivalent irreducible unitary $\Omega$ actually occurring
in $f$ equals $\|f\|^2$. An explicit estimate of the convergence can be given if
the terms are arranged by descending weights $\gamma(\Omega)$.

It is perhaps not justified to speak of a completeness relation, since a
host of "hidden" invariant sets are left in the dark, for the good reason
that they do not contribute to $f$ and $\|f\|^2$.


For the "interpretation by scalar functions on $\Pi$" the situation is a
little simpler. Here, according to an observation made by E. Cartan
[3, cf. also 11], the number of linearly independent $\Omega$-rows is at most $h$ for
any irreducible unitary representation $\Omega$ of degree $h$. We choose an orthogonal
basis $g^{(1)}, \cdots, g^{(m)}$ for them in order to determine the contribution $e(\Omega)$
to $f$, without rejecting the hidden ones. But even then $f$ must be used for
picking out the denumerable sequence of $\Omega$'s that actually contribute to $f$.

A subgroup $\sigma^0$ of $\sigma$ may be treated directly by observing that $f$ is also
almost periodic with respect to $\sigma^0$. This is a better procedure than
breaking up the $\sigma$-invariant sets obtained by our construction into primitive
$\sigma^0$-invariant sets. For then one would still face the task of getting rid of all
but the upright ones.

In the Main Theorem one can readily pass from a single given a.p.
vector $f$ to a finite number, or even a denumerable sequence, of such vectors,
$f_1, f_2, \cdots$. An $\Omega$-row $g'$ is $f$-hidden when $(g'_\mu, f) = 0$ ($\mu = 1, \cdots, h$).
For any $\nu = 1, 2, \cdots$ we construct an orthogonal basis $g^{(\nu 1)}, \cdots, g^{(\nu m_\nu)}$
for those $\Omega$-rows $g$ which are perpendicular to all $f_\nu$-hidden $\Omega$-rows $g'$,
$(g', g) = 0$, and moreover satisfy the relations

$(f_1, g_\mu) = 0, \cdots, (f_{\nu-1}, g_\mu) = 0 \qquad (\mu = 1, \cdots, h).$

Of course $m_\nu \le h$. If $m_1, m_2, \cdots$ all vanish then $\Omega$ "does not contribute."
Otherwise the whole (finite or infinite) sequence of $\Omega$-rows

$g^{(\nu i)} \qquad (\nu = 1, 2, \cdots;\ i = 1, \cdots, m_\nu)$

is orthogonal. For any vector $f$ we may determine the Fourier coefficients
$\alpha_\mu^{(\nu i)} = (g_\mu^{(\nu i)}, f)$ and form

$e_\nu(f; \Omega) = \sum_{i,\mu} \alpha_\mu^{(\nu i)} g_\mu^{(\nu i)}, \qquad \|e_\nu(f; \Omega)\|^2 = \sum_{i,\mu} |\alpha_\mu^{(\nu i)}|^2 \qquad (i = 1, \cdots, m_\nu).$

The contribution of $\Omega$ to $\|f\|^2$ is defined by

(3.5) $\|e(f; \Omega)\|^2 = \sum_\nu \|e_\nu(f; \Omega)\|^2$

and

(3.6) $\sum_\Omega \|e(f; \Omega)\|^2 \le \|f\|^2,$

the sum extending over a complete set of inequivalent contributing irreducible
unitary $\Omega$. For $f = f_\nu$ the sum (3.5) is finite, $\|e_\mu(f_\nu; \Omega)\|^2$ equaling zero
for $\mu = \nu + 1, \nu + 2, \cdots$, and the Bessel inequality (3.6) changes into the
corresponding equation. As a result, an orthogonal sequence $g_1, g_2, \cdots$ of
vectors, subdivided into sections that form primitive invariant sets, has been
found such that the Fourier series $\alpha_1 g_1 + \alpha_2 g_2 + \cdots$ with the coefficients
$\alpha_n = (g_n, f)$ converges weakly to $f$ for each of the given a.p. vectors
$f = f_1, f_2, \cdots$.


4. Strong approximation. Interpretation by vector functions on $\Pi$.
A further axiom connecting "length" $|g|$ with "modulus" $\|g\|$ is needed
for the transition from weak to strong convergence. Let us study the integral
$g_\xi = \int_s \xi(s)\cdot sg$ in which $\xi(s)$ is any $f$-continuous numerical function and
the vector $g$ of such nature that $sg$ is an $f$-continuous vector function of $s$.
In the interpretation by scalar functions on $\Pi$ this relation reads

$g_\xi(P) = \int_s \xi(s)\cdot g(s^{-1}P),$

and thus

$|g_\xi(P)|^2 \le \int_s |\xi(s)|^2 \cdot \int_s |g(s^{-1}P)|^2.$

The second factor on the right is independent of $P$ and has been denoted
by $\int_P |g(P)|^2 = \|g\|^2$. If $\|\xi\|^2$ stands for $\int_s |\xi(s)|^2$, we therefore arrive
at the inequality

$|g_\xi| = \operatorname{l.u.b.}_P |g_\xi(P)| \le \|\xi\|\cdot\|g\|.$

The special case suggests the following general axiom:

Axiom IV. Under the conditions specified above the integral
$g_\xi = \int_s \xi(s)\cdot sg$ satisfies the inequality
(4.1) $|g_\xi| \le \|\xi\|\cdot\|g\|.$


I do not deny its arbitrary character; one would wish to reduce it to 
some simpler assumptions. But two things can be said for it: (1) It holds 
in any metric vector space & provided one identifies |g| with || g ||. Indeed 
for g(s) = sg the inequality (C2) gives 


If €(s) gl’. 
(2) It holds for the interpretation by scalar functions on II. 
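
For a finite group the mean values are plain averages, and the inequality of Axiom IV in the scalar interpretation reduces to the Cauchy-Schwarz inequality. The following sketch checks it numerically under assumptions that are not in the text: the group is the cyclic group $Z_n$ acting on the $n$ points $P = 0, \cdots, n-1$ by translation, and the functions `xi` and `g` are arbitrary hypothetical test data.

```python
import cmath
import math

n = 12  # hypothetical: cyclic group Z_n acting on the points P = 0, ..., n-1

def mean(values):
    """Mean value over the finite group (or point field)."""
    return sum(values) / len(values)

def g_xi(xi, g):
    """g_xi(P) = mean over s of xi(s) * g(s^{-1} P), with s^{-1} P = (P - s) mod n."""
    return [mean([xi[s] * g[(P - s) % n] for s in range(n)]) for P in range(n)]

def modulus(h):
    """||h|| = sqrt(mean of |h|^2)."""
    return math.sqrt(mean([abs(v) ** 2 for v in h]))

def length(h):
    """|h| = least upper bound of |h(P)|."""
    return max(abs(v) for v in h)

# Arbitrary test data (assumptions for illustration only).
xi = [cmath.exp(2j * math.pi * 3 * s / n) + 0.5 * s for s in range(n)]
g = [math.sin(P) + 1j * (P % 3) for P in range(n)]

lhs = length(g_xi(xi, g))          # |g_xi|
rhs = modulus(xi) * modulus(g)     # ||xi|| * ||g||
```

Here `lhs <= rhs` holds for every choice of `xi` and `g`, which is the content of (4.1) in this special case.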


The following theorem is an immediate consequence of the new axiom. 


$\xi(s)$ being an $f$-continuous function, the Fourier series of the vector

$\eta = \int_s \xi(s)\cdot sf$,

$\sum_n \eta_n g_n, \qquad \eta_n = (g_n, \eta),$

converges strongly towards $\eta$. In the sum $\sum_n$ members of the same set may
not be separated.
Let us indicate the contribution of the first p invariant sets, 
etet:::+t epi, by -}p and carry out our calculations for one 
of the sets, to which the old notations 


may refer. We have : 
(4. 2) = = Mp Jv (Gv» Sf) Jv» 
therefore 
&(8) = 9) Iv = 
and 


where f‘?) denotes the remainder f—{e-+---},. The formula (4.2) for 
se shows that 


and thus 
with A=~1+ Thus sf is an f-continuous vector 


function of s, and our axiom yields for the remainder in (4.3) the estimate 
& 


But one knows that || f'”) || 68, tends to zero with p> o. 
THEOREM OF STRONG APPROXIMATION. $f$ can be strongly approximated
with arbitrary accuracy by finite sums of the form $\sum_{k=1}^{n} \eta_k g_k$.


The numerical function $|sf - f| = \rho(s)$ is $f$-continuous since
$|\rho(s) - \rho(t)| \le |sf - tf|$. Choose any $\epsilon > 0$ and let $l(\rho)$ be a non-negative
uniformly continuous function of the variable $\rho$ ($\ge 0$) which vanishes
for $\rho \ge \epsilon$ and for which the integral of the $f$-continuous function $l(\rho(s))$
$= \lambda(s)$ is 1. Form the difference

$\int_s \lambda(s)\cdot sf - f = \int_s \lambda(s)(sf - f).$

Considering that $l(\rho)\cdot\rho \le \epsilon\cdot l(\rho)$ for $\rho \ge 0$, namely both for $0 \le \rho < \epsilon$
and for $\rho \ge \epsilon$, one gets

$\left|\int_s \lambda(s)\cdot sf - f\right| \le \int_s \lambda(s)\cdot|sf - f| \le \epsilon \int_s \lambda(s) = \epsilon.$

But the Fourier series of $\eta = \int_s \lambda(s)\cdot sf$ converges strongly to $\eta$, and hence
we obtain coefficients $\eta_k = (g_k, \eta)$ and an $n = n_p$ such that

$\left|\eta - \sum_{k=1}^{n} \eta_k g_k\right| \le \epsilon.$

The result is a strong approximation of $f$ with the accuracy $2\epsilon$.

It follows readily that not only f, but every vector in %;° can be 
approximated in the same manner. 

It should be noted that the Axiom IV in general does not carry over 
from the group o to a subgroup o°. 

In the interpretation by functions f(P) on II it seems unnatural to 
limit oneself to the case where the value of the function is a number; it may 
itself be a vector with several components. So it is in physics, where the 
electromagnetic field strength has 6 and the electronic y has 2 or 4 com- 
ponents. We may even admit the value f(P) to be a vector in a metric 
vector space of infinitely many dimensions. Let us therefore now assume that 
we are given a metric vector space % (Axioms I and IT) closed in the Hilbert 
sense, so that any weakly convergent sequence $f_1, f_2, \cdots$, $\|f_n - f_m\| \to 0$,
converges weakly toward a vector $f$, $\|f_n - f\| \to 0$. (Should the metric space
not be closed, we make it so by Cantor's construction.) Suppose, moreover,
we are given a point field II and a realization of the group o by a transitive 
group of transformations P—>sP of this point field (Axiom I,). We study 
functions f which associate with every point P of II a vector f(P) in &. 
The transform $sf$ is then to be defined as the vector function associating
with $P$ the vector $sf(s^{-1}P)$. But it seems convenient to define a transform
$(s,t)f$ for any pair of elements $(s,t)$ of $\sigma$ by ascribing to $(s,t)f$ the value
$sf(t^{-1}P)$ at the point $P$. The pairs $(s,t)$ form the group $\sigma\times\sigma$ in which $\sigma$
itself is contained as the subgroup of the diagonal pairs $(s,s)$; $sf = (s,s)f$.
The restriction of almost-periodicity to be imposed upon f is twofold: 


1. Let us say that the element $t$ of $\sigma$ lies in the $f$-disk $D_f(b;\epsilon)$ of
radius $\epsilon$ around $b$ if $\|f(t^{-1}P) - f(b^{-1}P)\| \le \epsilon$ for all points $P$. We assume
that for every $\epsilon > 0$ the group $\sigma$ may be covered by a finite number of such
$f$-disks $D_f(b_k;\epsilon)$ ($k = 1, \cdots, n$) of radius $\epsilon$. Choose a definite point $P_0$.
For an $f$ satisfying this condition the numerical function $\|f(t^{-1}P_0)\|$ of $t$
and hence the function $\|f(P)\|$ of $P$ is bounded. We define the length $|f|$
of $f$ by

$|f| = \operatorname{l.u.b.}_P \|f(P)\|.$

This length is invariant with respect to all transformations $(s,t)$ of $\sigma\times\sigma$,
$|(s,t)f| = |f|$, in particular to all transformations $(s,s)$ of $\sigma$. For any
two vector functions $f(P)$, $g(P)$ satisfying our condition the scalar product
$\gamma(P) = (g(P), f(P))$ is an almost periodic numerical function on $\Pi$, and we
may therefore form the constant $\int_t \gamma(t^{-1}P) = \int_P \gamma(P)$ and introduce it as the
scalar product $(g, f)$. Again invariance prevails: $((s,t)g, (s,t)f) = (g, f)$.


2. Let us say that the element $s$ lies in the $f$-circle $C_f(a;\epsilon)$ of radius
$\epsilon$ around $a$ if $\|sf(P) - af(P)\| \le \epsilon$ for all points $P$, and now assume that $\sigma$
can be covered not only by a finite number of $f$-disks $D_f(b_k;\epsilon)$ ($k = 1, \cdots, n$),
but also by a finite number of $f$-circles $C_f(a_i;\epsilon)$ ($i = 1, \cdots, m$) of arbitrarily
small radius $\epsilon$. If $s$ lies in $C_f(a_i;\epsilon)$ and $t$ in $D_f(b_k;\epsilon)$ we have

$\|sf(t^{-1}P) - a_i f(b_k^{-1}P)\| \le \|sf(t^{-1}P) - a_i f(t^{-1}P)\| + \|a_i f(t^{-1}P) - a_i f(b_k^{-1}P)\| \le 2\epsilon,$

or $|(s,t)f - (a_i, b_k)f| \le 2\epsilon$. Hence $f$ is almost periodic with respect to
the group $\sigma\times\sigma$ of the pairs $(s,t)$ and a fortiori with respect to the subgroup
$\sigma$ of the diagonal pairs $(s,s)$.
All the axioms J — III are satisfied for the space & of the almost periodic 


vector functions f, both with respect to «xX a and to «. Also the Axiom IV 
introduced in this section holds. Indeed consider 


ge(P) = f f &(s,t) -sg(tP). 
t 
One finds by applying (C.) to the (s, ¢)-means, 


The second factor on the right equals 


Hence | g¢|= || é||- | g ||. The same argument applies to a simple mean 
value of the type 


t 
8 




5. Appendix. On Maak’s approach to the theory of almost periodic 
functions. 


1. The marriage problem. Given $n$ distinct objects $\{a\} = a_1, \cdots, a_n$
(boys) and $n$ distinct objects $\{b\} = b_1, \cdots, b_n$ (girls); moreover a scheme
of linkage $Q_n$ according to which an $a_i$ and a $b_k$ are either linked (friends)
or not linked. I call the number of elements in a set its rank. A set $B$ of
girls is said to be associate to a given set $A$ of boys if no boy in $A$ has friends
outside the set $B$. (Then the complement $\{a\} - A$ is in the same sense an
associate of $\{b\} - B$.) Question: What is the necessary and sufficient condition
that the boys can be paired with the girls in such a fashion that in
each of the $n$ pairs the partners are friends? The following basic condition
is obviously necessary: A set $A$ of boys has never an associate set $B$ of girls
of lesser rank than $A$. The fundamental combinatorial lemma asserts that
this condition is also sufficient [9, 4, 6].


Proof. I let the girls $b_1, \cdots, b_n$ choose their partners one after the
other, and therefore ask $b_1$ first to make her choice. Suppose she chooses $a_1$.
The choice should be fair to herself, i.e. $a_1$ and $b_1$ should be friends. But it
should also be fair to the other girls by not making it impossible for them
to find partners among their friends, i.e. the linkage scheme $Q_{n-1}$ arising
from $Q_n$ by removing $a_1$ and $b_1$ should still satisfy the basic condition. Call
a set $A$ of boys of rank $r$ ($\ge 1$) distinguished if it has an associate set
$B = (b_1, b_{k_2}, \cdots, b_{k_r})$ of girls that is of the same rank $r$ and contains $b_1$.
Then the second postulate requires that there should be no distinguished
set $A$ which does not contain $a_1$, or $a_1$ must be in the intersection of all
distinguished sets.

We make the basic observation that the intersection of two distinguished
sets is again distinguished and is not empty. Indeed let $A_i$ be a distinguished
set of rank $r_i$ and $B_i$ an associate of the same rank ($i = 1, 2$). Let
$A = A_1 \cap A_2$ be of rank $r$ and $B = B_1 \cap B_2$ of rank $s$. The rank $s$ is at
least 1, since $B$ contains $b_1$. $B$ is an associate of $A$, hence $r \le s$. Moreover
$B_1 \cup B_2$ is an associate of $A_1 \cup A_2$; thus $r_1 + r_2 - r \le r_1 + r_2 - s$ or $r \ge s$.
The resulting equation $r = s$ together with $s \ge 1$ proves the point.

Let $A_0 = (a_{i_1}, \cdots, a_{i_m})$ be a distinguished set of lowest rank $m \ge 1$
and $(b_1, b_{k_2}, \cdots, b_{k_m})$ its associate of the same rank. $A_0$ is contained in
every distinguished set $A$ (and therefore unique) since $A \cap A_0$ can not be
of smaller rank than $A_0$. At least one of the boys $a_{i_1}, \cdots, a_{i_m}$ in the set $A_0$
is a friend of $b_1$; for otherwise $A_0$ would have an associate set $(b_{k_2}, \cdots, b_{k_m})$
of rank $m - 1$.

Let therefore $b_1$ pick one of her friends $a_i$ in $A_0$, and rearrange the
sequence $a_1, \cdots, a_n$ so that her mate $a_i$ assumes the first place. Then
the linkage scheme $Q_{n-1}$ of $a_2, \cdots, a_n$ with $b_2, \cdots, b_n$ satisfies the basic
condition. By induction the girls thus solve their marriage problem.
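
The combinatorial lemma can be tried out on small linkage schemes. The sketch below is not the choice procedure of the proof above: it checks the basic condition by brute force over all sets of boys (the smallest associate set of a set $A$ is the set of all friends of its members), and it finds a pairing by the standard augmenting-path search, a swapped-in technique. The names `friends`, `basic_condition` and `pair_up` are hypothetical.

```python
from itertools import combinations

def basic_condition(friends):
    """friends[i] is the set of girls linked to boy i.
    True if no set A of boys has an associate set of lesser rank than A."""
    n = len(friends)
    for r in range(1, n + 1):
        for A in combinations(range(n), r):
            smallest_associate = set().union(*(friends[i] for i in A))
            if len(smallest_associate) < r:
                return False
    return True

def pair_up(friends):
    """Return boy_of (boy_of[k] = boy paired with girl k), or None if no full pairing exists."""
    n = len(friends)
    boy_of = [None] * n

    def place(i, seen):
        # Try to give boy i a girl, re-placing earlier boys along an augmenting path.
        for k in friends[i]:
            if k in seen:
                continue
            seen.add(k)
            if boy_of[k] is None or place(boy_of[k], seen):
                boy_of[k] = i
                return True
        return False

    for i in range(n):
        if not place(i, set()):
            return None
    return boy_of

# Two hypothetical schemes with n = 3.
good = [{0, 1}, {0}, {1, 2}]
bad = [{0}, {0}, {1, 2}]      # boys 0 and 1 share a single friend
good_pairs = pair_up(good)
bad_pairs = pair_up(bad)
```

In both examples the brute-force test and the search agree, as the lemma asserts: a full pairing exists exactly when the basic condition holds.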


2. The objects to which the combinatorial lemma will be applied are
pieces $\sigma_i$ of the group $\sigma$, and linkage of a piece $\sigma'$ and a piece $\tau'$ will mean
that they have points in common. [Contrary to Maak, we do not forbid the
roof-tiles $\sigma_i$ to overlap.] Let $f$ be an a.p. vector and $\phi(s)$ an $f$-continuous
numerical function, so that for every $\delta > 0$ there is an $\epsilon > 0$ such
that $|sf - tf| \le \epsilon$ implies $|\phi(s) - \phi(t)| \le \delta$. Then it implies also
$|\phi(as) - \phi(at)| \le \delta$ for every element $a$. Therefore the function $\phi(s)$ is
almost periodic in the sense that for every $\delta$ one can cover $\sigma$ by a finite number
of tiles $\sigma_i$ such that $|\phi(as) - \phi(at)| \le \delta$ for any $s, t$ in the same piece $\sigma_i$
and any $a$. By a simple argument, which I shall not repeat here, Maak
infers from this one-sided the two-sided almost-periodicity.

Let us say that $s$ lies in the circlet of diameter $\rho > 0$ around $s_0$ if

$|\phi(asb) - \phi(as_0 b)| \le \tfrac{1}{2}\rho$ for all $a$ and $b$.

Our statement means that a finite number of elements $c_\kappa$ ($\kappa = 1, \cdots, N$)
may be ascertained such that the circlets $\sigma_\kappa$ of diameter $\rho$ around the $c_\kappa$
cover the entire group. Choose the smallest number $N$ of elements $c_\kappa$
satisfying this condition. We speak of them as a $\rho$-lattice and of

$L_\rho = N^{-1}\cdot\sum_\kappa \phi(c_\kappa)$

as the lattice average of $\phi$. (We differ from Maak by using no other domains
but circlets. It is a task of fundamentally simpler nature to determine the
minimum number of points which have a given property than the minimum
number of domains.) Submit the lattice circlets $\sigma_\kappa$ to an arbitrary left-translation
$a$, $s \to as$, $\sigma_\kappa \to \tau_\kappa = a\sigma_\kappa$. The $\tau_\kappa$ are circlets of the same
diameter $\rho$. For any $r$ lattice circlets $\sigma_{\kappa_1}, \cdots, \sigma_{\kappa_r}$ it will never happen that
those $\tau_\kappa$ that are linked with $\sigma_{\kappa_1}$ or $\sigma_{\kappa_2}$ $\cdots$ or $\sigma_{\kappa_r}$ are less numerous. For
then we could replace the circlets $\sigma_{\kappa_1}, \cdots, \sigma_{\kappa_r}$ by these $\tau_\kappa$ that are linked
with them, and thereby lower the total number $N$. Our combinatorial lemma
shows that the $\sigma_\kappa$ and the $\tau_\kappa$ may be paired $(\tau_\kappa, \sigma_\kappa)$ in such a way that $\tau_\kappa$
and $\sigma_\kappa$ overlap.

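The centers $d_\kappa = ac_\kappa$ and $c_\kappa$ of two circlets of diameter $\rho$ which overlap
satisfy the inequality

$|\phi(d_\kappa b) - \phi(c_\kappa b)| \le \rho$

for all $b$. Consequently

(5.1) $\left|N^{-1}\sum_\kappa \phi(ac_\kappa b) - N^{-1}\sum_\kappa \phi(c_\kappa b)\right| \le \rho$

for arbitrary $a$ and $b$, in particular

(5.2) $\left|N^{-1}\sum_\kappa \phi(ac_\kappa) - N^{-1}\sum_\kappa \phi(c_\kappa)\right| \le \rho.$

By the same token the right-translation $s \to sb$ yields the inequality

(5.3) $\left|N^{-1}\sum_\kappa \phi(c_\kappa b) - N^{-1}\sum_\kappa \phi(c_\kappa)\right| \le \rho.$

Let $a_i$ ($i = 1, \cdots, n$) be any number of elements and $\alpha_i \ge 0$ corresponding
weights of sum 1. The average $A = \sum_i \alpha_i \phi(a_i)$ was said to oscillate
by less than $\epsilon$ if

$\left|\sum_i \alpha_i \phi(aa_i) - A\right| \le \epsilon$

identically in $a$. Let $A$, $B$ be two such averages of $\phi$ oscillating by less than
$\epsilon$ and $\delta$ respectively. We then have

(5.4) $\left|\sum_i \alpha_i \phi(c_\kappa a_i) - A\right| \le \epsilon$

for each $c_\kappa$ of our $\rho$-lattice. On the other hand (5.3) gives for $b = a_i$:

(5.5) $\left|N^{-1}\sum_\kappa \phi(c_\kappa a_i) - L_\rho\right| \le \rho.$

(5.4) and (5.5) lead to

$|L_\rho - A| \le \epsilon + \rho.$

For the same reason

$|L_\rho - B| \le \delta + \rho;$

consequently $|B - A| \le \epsilon + \delta + 2\rho$. This must be true for every $\rho > 0$,
which is impossible unless

$|B - A| \le \epsilon + \delta,$

cf. formula (1.4).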
It is this fact on which the existence of the mean value $J = \int_s \phi(s)$
depends; any average $A$ oscillating by less than $\epsilon$ differs from it by not more
than $\epsilon$. According to (5.2) and (5.1) the lattice averages

$L_\rho\langle\phi\rangle = N^{-1}\sum_\kappa \phi(c_\kappa)$ and $L_\rho\langle\phi_b\rangle = N^{-1}\sum_\kappa \phi(c_\kappa b)$

of $\phi(s)$ and $\phi_b(s) = \phi(sb)$ oscillate by less than $\rho$. Among themselves they
differ by not more than $\rho$, (5.3). This proves that the absolute difference
of $\int_s \phi(sb)$ and $\int_s \phi(s)$ can not exceed $3\rho$, and as $\rho$ is arbitrary, these two
mean values must coincide.
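
For a finite group the conclusion can be checked directly, since the set of all group elements may serve as a lattice (not necessarily a minimal one) and the mean value is the plain average. The sketch below assumes the symmetric group $S_3$, represented by permutation tuples, and an arbitrary integer-valued test function `phi`; both are illustrative choices, not taken from the text.

```python
from itertools import permutations

group = list(permutations(range(3)))  # S_3, each element a tuple p with p[i] the image of i

def compose(p, q):
    """Group product: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def phi(s):
    """Arbitrary integer-valued test function on the group (hypothetical)."""
    return 4 * s[0] - s[1] * s[1] + 7 * (s[2] == 1)

def mean(func):
    """Mean value of func over the finite group."""
    return sum(func(s) for s in group) / len(group)

J = mean(phi)

# The mean of the two-sided translate s -> phi(a s b) equals J for all a and b.
shifts_ok = all(
    mean(lambda s, a=a, b=b: phi(compose(compose(a, s), b))) == J
    for a in group
    for b in group
)
```

Here the translated averages agree with `J` exactly, because translation merely permutes the group elements; in the general almost periodic case the text obtains the same invariance only in the limit $\rho \to 0$.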


INSTITUTE FOR ADVANCED STUDY. 


REFERENCES. 


[1] S. Bochner and J. von Neumann, "Almost periodic functions in groups,"
Transactions of the American Mathematical Society, vol. 37 (1935), pp. 21-50.

[2] H. Bohr, "Zur Theorie der fastperiodischen Funktionen I," Acta Mathematica,
vol. 45 (1925), pp. 29-127.

[3] E. Cartan, "Sur la détermination d'un système orthogonal complet dans un
espace de Riemann symétrique clos," Rendiconti del Circolo Matematico di Palermo,
vol. 53 (1929), pp. 217-252.

[4] P. Hall, "On representatives of subsets," The Journal of the London Mathematical
Society, vol. 10 (1935), pp. 26-29.

[5] Anna Hurevitsch, "Unitary representations in Hilbert space of a compact
topological group," Recueil Mathématique (Matematicheskii Sbornik), vol. 13 (1943),
pp. 79-86.

[6] W. Maak, "Eine neue Definition der fastperiodischen Funktionen" and
"Abstrakte fastperiodische Funktionen," Abhandlungen aus dem Mathematischen
Seminar der Hamburgischen Universität, vol. 11 (1935), pp. 240-244, and vol. 11 (1936),
pp. 367-380.

[7] J. von Neumann, "Almost periodic functions in a group I," Transactions
of the American Mathematical Society, vol. 36 (1934), pp. 445-492.

[8] F. Peter and H. Weyl, "Die Vollständigkeit der primitiven Darstellungen
einer geschlossenen kontinuierlichen Gruppe," Mathematische Annalen, vol. 97 (1927),
pp. 737-755.

[9] R. Rado, "Bemerkungen zur Kombinatorik im Anschluss an Untersuchungen
von Herrn D. König," Sitzungsberichte der Preussischen Akademie der Wissenschaft,
vol. 32 (1933), p. 68.

[10] H. Weyl, "Integralgleichungen und fastperiodische Funktionen," Mathematische
Annalen, vol. 97 (1927), pp. 338-356.

[11] H. Weyl, "Harmonics on homogeneous manifolds," Annals of Mathematics,
vol. 35 (1934), pp. 486-499.




A CRITERION FOR THE NON-DEGENERACY OF THE WAVE 
EQUATION.* 


By PHILIP HARTMAN and AUREL WINTNER.


The results of this paper are similar to those of the first part of [5] 
but go in another direction. Correspondingly, the knowledge of [5] will 
not be presupposed. 

For large positive $t$, let $f = f(t)$ be a real-valued, continuous function.
By solutions $x = x(t)$ of the linear differential equation

(1) $x'' + f(t)x = 0$

will be meant only real-valued solutions. Such a solution will be called of
class $(L^2)$ if

(2) $\int^{\infty} x^2(t)\,dt < \infty.$

In terms of the (continuous, non-negative) function

(3) $f^{+}(t) = \max(f(t), 0),$

the following theorem will be proved:

(*) If, as $t \to \infty$,

(4) $f(t) = O_R(t^2)$, i.e., $f^{+}(t) = O(t^2)$,

or, more generally, if

(5) $\int^{t} f^{+}(s)\,ds = O(t^3),$

then (1) cannot have two linearly independent solutions of class $(L^2)$.

According to Weyl, (1) cannot have two linearly independent solutions
of class $(L^2)$ if

(6) $f(t) = O_R(1)$, i.e., $f^{+}(t) = O(1)$

(cf. [9], p. 238). No improvement of (6) is known to us from the literature.
In view of the particular case (4) of (5), the improvement of (6) to be
obtained is of the order $t^2$.

* Received June 1, 1948. 


The result is final, in the sense that the $t^2$ in (4) cannot be improved
to $t^{2+\epsilon}$ (for any $\epsilon > 0$). In order to see this, it is sufficient to apply to the
general solution of

(7) $x'' + t^{2+\epsilon}x = 0$

an asymptotic formula (cf. [10]), which implies that every solution of (7)
is $O(t^{-1/2-\epsilon/4})$ and, therefore, of class $(L^2)$ (actually, (7) is merely the normal
form of Bessel's equation).
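
The step from the asymptotic bound to membership in class $(L^2)$ is a one-line integral estimate. A sketch, assuming only the stated bound $|x(t)| \le C\,t^{-1/2-\epsilon/4}$ for $t \ge 1$ with some constant $C$ (the constant and the lower limit 1 are illustrative choices):

```latex
\[
  \int_{1}^{\infty} x^{2}(t)\,dt
  \;\le\; C^{2}\int_{1}^{\infty} t^{-1-\epsilon/2}\,dt
  \;=\; \frac{2C^{2}}{\epsilon} \;<\; \infty
  \qquad (\epsilon > 0).
\]
% For \epsilon = 0 the majorant t^{-1} is no longer integrable, which is
% consistent with the exponent 2 in (4) being the critical one.
```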

The proof of (*) will depend on an adaptation of the method of "arcus
variation" (cf. [6], [8], [4]). It consists of an estimate of $N(t)$, where,
if $f(t)$ is real-valued and continuous on the half-line $0 \le t < \infty$ and $x = x(t)$
denotes any non-trivial solution ($\not\equiv 0$) of (1), the function $N(t)$ is defined
as the number of zeros of $x(s)$ on the interval $0 \le s \le t$. The choice of the
solution $x(t) \not\equiv 0$ with reference to which $N(t)$ is estimated is immaterial.
In fact, Sturm's separation theorem implies that the $N$-functions belonging
to different non-trivial solutions of (1) cannot differ by more than $\pm 1$ for
any value of $t$. Hence, the assumption, (8), of the following theorem is a
property of the coefficient function $f$ of (1) alone.


(**) If

(8) $\liminf_{t\to\infty} N(t)/t^2 < \infty$

(for instance, if

(8 bis) $N(t) = O(t^2)$

as $t \to \infty$), then (1) cannot have two linearly independent solutions of class
$(L^2)$.

It was shown in [2] that (1) cannot have two linearly independent
solutions of class $(L^2)$ if

(9) $N(t) = O(1).$

The improvement of (9) to (8 bis) corresponds to that of (6) to (4). The
result is again final, in the sense that the $t^2$ cannot be improved to $t^{2+\epsilon}$ for
any $\epsilon > 0$. This follows again from the asymptotic formula for the solutions
of (7).

This does not mean, of course, that the assertion of (**) must fail if
(8) is violated. In fact, if

(10) $f(t) = t^2 \log^2 t \qquad (t > 1),$

then, on the one hand, the result of [4] implies that

$\pi N(t) \sim \int^{t} s \log s\,ds \sim \tfrac{1}{2} t^2 \log t$

and, on the other hand, the asymptotic formula of [10] shows that no non-trivial
solution of (1) is of class $(L^2)$ in the case (10).

It turns out that (*) is a corollary of (**). This situation is due to the
following


LEMMA. If $f(t)$, where $0 \le t < \infty$, is a real-valued, continuous function
and if $f^{+}(t)$ is defined by (3), then

(11) $N(t) = O\!\left(\left(t \int_0^t f^{+}(s)\,ds\right)^{1/2}\right) + O(1)$

holds for the number of zeros of every non-trivial solution of (1).

Since (11) shows that (5) implies (8) [and even (8 bis)], it will be
sufficient to prove the Lemma and (**).


Proof of the Lemma. It is clear from (3) and from Sturm's comparison
theorem that, if $N^{+}(t)$ belongs to

$x'' + f^{+}(t)x = 0$

in the same way as $N(t)$ belongs to (1), then

$N(t) \le N^{+}(t) + 1.$

This implies that it is sufficient to prove the Lemma under the assumption that
$f = f^{+}$, i.e., that

(12) $f(t) \ge 0.$

In other words, the assertion of the Lemma is that (12) is sufficient for

(13) $N(t) = O\!\left(\left(t \int_0^t f(s)\,ds\right)^{1/2}\right) + O(1).$

The truth of the latter implication is a corollary of the following result
of Beurling, quoted by Borg [1], p. 1: If $f(t)$ is real-valued and continuous
on an interval $a \le t \le b$, then no solution $x(t) \not\equiv 0$ of (1) can have
more than one zero on $a \le t \le b$ unless

(14) $\int_a^b |f(t)|\,dt > \dfrac{4}{b - a}.$


In order to deduce from this the estimate (13) in the case (12), let
$x(t)$, where $0 \le t < \infty$, be that solution of (1) determined by the initial
conditions $x(0) = 0$ and $x'(0) = 1$, and let $0 = t_0 < t_1 < \cdots$ denote the
zeros of this solution. It can be assumed that this sequence is infinite, since
otherwise $N(t) = O(1)$, and so (13) is trivial. Since the solution $x(t)$
has two zeros on the interval $a \le t \le b$ if $a = t_{n-1}$ and $b = t_n$, where $n$ is any
positive integer, it follows that (14) must now hold for every $n$. In view of
(12), this implies that

$\int_{t_{n-1}}^{t_n} f(s)\,ds > 4/(t_n - t_{n-1})$.

On the other hand, the harmonic mean of $n$ positive numbers is majorized
by their arithmetic mean,

$n \Big/ \sum_{k=1}^{n} \frac{1}{t_k - t_{k-1}} \le \frac{1}{n} \sum_{k=1}^{n} (t_k - t_{k-1})$;

finally,

$\sum_{k=1}^{n} (t_k - t_{k-1}) = t_n$,

since $t_0 = 0$. The last three formula lines imply that

$t_n \int_{0}^{t_n} f(s)\,ds > 4n^{2}$.

In view of the definitions of $t_n$ and $N(t)$, the proof of (13) is now complete.
Since this proves the Lemma, only (**) remains to be proved.
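
The way the three displayed relations of the proof combine can be written as
a single chain. The summation index $k$ and the final rearrangement are
notation introduced here for illustration, under the assumption (12).

```latex
\int_{0}^{t_n} f(s)\,ds
  = \sum_{k=1}^{n}\int_{t_{k-1}}^{t_k} f(s)\,ds
  > \sum_{k=1}^{n}\frac{4}{t_k-t_{k-1}}
  \ge \frac{4n^{2}}{\sum_{k=1}^{n}(t_k-t_{k-1})}
  = \frac{4n^{2}}{t_n},
\qquad\text{hence}\qquad
n < \tfrac12\Bigl(t_n\int_{0}^{t_n} f(s)\,ds\Bigr)^{1/2}.
```

Since $f \ge 0$, the product $t \int_{0}^{t} f(s)\,ds$ is non-decreasing in $t$, so
for $t_n \le t < t_{n+1}$ the bound on $n$ carries over to $N(t)$ up to an additive
constant, which is the content of (13).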


Proof of (**). Let $x = x(t)$ and $y = y(t)$ be two linearly independent
solutions of (1), where $f(t)$ is any real-valued, continuous function on the
half-line $0 \le t < \infty$. Since the Wronskian of $x(t)$ and $y(t)$ is a
non-vanishing constant, it can be assumed that

(15) $xy' - x'y = 1$.

In particular, $x$ and $y$ cannot vanish at the same $t$, and so the substitution

(16) $x = r\cos\theta$, $y = r\sin\theta$

determines a unique positive $r = r(t)$ and, if $0 \le \theta(0) < 2\pi$, a unique
continuous $\theta = \theta(t)$ for $0 \le t < \infty$. It follows that $\theta(t)$ has a
continuous derivative. In addition,

(17) $\theta'(t) > 0$,

since (15) and (16) imply that $r^{2}\theta' = 1$. If the latter identity is written
in the form

$x^{2} + y^{2} = 1/\theta'$,

and if it is observed that the assertion of (**) is equivalent to the statement
that not both $x = x(t)$ and $y = y(t)$ are of class $(L^2)$, it follows that (**)
will be proved if it is shown that

(18) $\int^{\infty} dt/\theta'(t) = \infty$

is implied by the assumption, (8), of (**).
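
The identity $r^{2}\theta' = 1$ follows from a direct substitution of the polar
form into the Wronskian; the computation below is a routine verification
supplied here, not a step quoted from the paper.

```latex
\begin{aligned}
x' &= r'\cos\theta - r\theta'\sin\theta, \qquad
y' = r'\sin\theta + r\theta'\cos\theta,\\
xy' - x'y
  &= r\cos\theta\,(r'\sin\theta + r\theta'\cos\theta)
   - (r'\cos\theta - r\theta'\sin\theta)\,r\sin\theta\\
  &= r^{2}\theta'\,(\cos^{2}\theta + \sin^{2}\theta) = r^{2}\theta' .
\end{aligned}
```

With the normalization of the Wronskian to 1 this gives
$x^{2} + y^{2} = r^{2} = 1/\theta'$, so the sum of the two $(L^2)$-integrals is the
integral of $1/\theta'$.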
Since $\theta'(t)$ is continuous and positive, its harmonic mean on any interval
is majorized by its arithmetical mean; so that, if $0 \le a < a + t$,

$t \Big/ \int_{a}^{a+t} \frac{ds}{\theta'(s)} \le \frac{1}{t} \int_{a}^{a+t} \theta'(s)\,ds$.

Since the integral on the right is identical with $\theta(a + t) - \theta(a)$, it follows
that

$\int_{a}^{a+t} \frac{ds}{\theta'(s)} \ge t^{2}/\{\theta(a + t) - \theta(a)\}$.

If the numerator and denominator of the function on the right of this
inequality are divided by $(t + a)^{2}$, it follows, by letting $t \to \infty$ while $a$ is
fixed, that

(19) $\int_{a}^{\infty} \frac{ds}{\theta'(s)} \ge \limsup_{t \to \infty} t^{2}/\theta(t)$.

If (18) is false, then the integral on the left of (19) tends to 0 as
$a \to \infty$. On the other hand, the upper limit on the right of (19) is
non-negative and independent of $a$. Hence, if (18) is false, the upper limit on
the right of (19) must be 0, i.e., the estimate

(20) $\liminf_{t \to \infty} \theta(t)/t^{2} < \infty$

cannot hold. Consequently, (18) cannot be false if (20) is true.
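
The comparison of the harmonic and arithmetical means of $\theta'$ used in this
argument can be seen as an instance of the Cauchy-Schwarz inequality. That
name is a gloss added here; the paper itself only speaks of the two means.

```latex
t^{2}
  = \Bigl(\int_{a}^{a+t} \sqrt{\theta'(s)}\cdot\frac{1}{\sqrt{\theta'(s)}}\,ds\Bigr)^{2}
  \le \int_{a}^{a+t}\theta'(s)\,ds \;\cdot \int_{a}^{a+t}\frac{ds}{\theta'(s)}
  = \{\theta(a+t)-\theta(a)\}\int_{a}^{a+t}\frac{ds}{\theta'(s)} .
```

Dividing by $\theta(a+t) - \theta(a)$, which is positive because $\theta' > 0$, gives the
lower bound for the integral of $1/\theta'$ over the interval from $a$ to $a + t$.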




It follows that the proof of (**) will be complete if it is shown that (8)
implies (20).

To this end, recourse must be had to the definition of $N(t)$. In view
of (16) and (17), this definition implies that the integral part of the ratio
$\theta(t)/\pi$ differs from the integer $N(t)$ by not more than 1. Hence,

$\theta(t) = \pi N(t) + O(1)$,

and so (20) is implied by (8).


Appendix. 


The criteria (*), (**), as well as all other known criteria applying in
this direction (cf. [5]), seem to suggest the truth of the following conjecture:
If $f(t)$ and $g(t)$ are continuous functions satisfying the inequality

(1) $f(t) \le g(t)$

for large positive $t$, and if

(2) $x'' + g(t)x = 0$

possesses a solution which is not of class $(L^2)$, then

(3) $x'' + f(t)x = 0$

also has a solution which is not of class $(L^2)$. The object of this Appendix
is the construction of an example disproving this conjecture.

If $f(t) = e^{Ct}$, where $C$ is any positive constant, it follows either from
asymptotic formulae (cf. [10]) or from explicit integrations, that every
solution of (3) is of class $(L^2)$. Hence, it is sufficient to show that there
exist functions $g(t)$ which satisfy the inequality

(4) $g(t) \ge e^{Ct}$

(for some $C > 0$ and large $t$) but are such that not every solution of the
corresponding differential equation (2) is of class $(L^2)$.

It will be convenient to first choose such a $g(t)$ to be a suitable
step-function (in this connection, cf. [7], p. 50) and then to ascertain that a
removal of the jumps of this discontinuous $g(t)$ has no effect on the result.
Such a step-function $g(t)$ can even be chosen monotone, as follows:

(5) $g(t) = k^{2}$ if $a_{k-1} < t \le a_k$,

where

(6) $a_k = 2\pi \sum_{j=1}^{k} j^{-1}$ if $k = 1, 2, \cdots$ and $a_0 = 0$;

so that $g(t)$ is defined for $0 < t < \infty$.

Since this step-function is monotone, (4) will be proved for continuous
$t$ if it is proved for $t = a_k$. In view of (5), this requires the existence of a
$C > 0$ satisfying the inequality

$k^{2} \ge \exp(C a_k)$.

But the existence of such a $C$ is assured by the fact that

$a_k \sim 2\pi \log k$,

by (6).
Next, (5) and (6) show that (2) reduces to

$x'' + k^{2}x = 0$ for $a_{k-1} < t \le a_k$

and admits, therefore, the solution

(7) $x(t) = \cos k(t - a_k)$ for $a_{k-1} < t \le a_k$.

The function $x(t)$ defined by (7) for $0 < t < \infty$ is continuous and has a
continuous first derivative, even at the points $t = a_k$. This follows by
observing that

$k(a_k - a_{k-1}) = 2\pi$,

by (6). Furthermore, by (7) and (6),

$\int_{a_{k-1}}^{a_k} x^{2}(s)\,ds = \int_{0}^{2\pi/k} \cos^{2} ks \, ds = \pi/k$.

Hence, from (6),

(8) $\int_{0}^{t} x^{2}(s)\,ds \to \infty$ as $t \to \infty$,

and so the $(L^2)$-condition $\int^{\infty} x^{2}(s)\,ds < \infty$ is not satisfied.


All that remains to be ascertained is that the jumps of g(t) can be 
smoothed out without affecting the result. But this follows, for instance, 
from (8) and the argument applied in the footnote in [3], pp. 396-397. 
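
The step-function construction of this Appendix can be checked numerically.
The sketch below assumes the reading $g(t) = k^{2}$ on $a_{k-1} < t \le a_k$ with
$a_k = 2\pi(1 + 1/2 + \cdots + 1/k)$ and $x(t) = \cos k(t - a_k)$ on that piece; the
function names and the sample constant C = 0.1 are hypothetical choices made
for illustration and do not come from the paper.

```python
import math


def breakpoints(n):
    """Return [a_0, ..., a_n] with a_0 = 0 and a_k = 2*pi*(1 + 1/2 + ... + 1/k)."""
    a = [0.0]
    for j in range(1, n + 1):
        a.append(a[-1] + 2.0 * math.pi / j)
    return a


def piece_integral(k, a, steps=2000):
    """Midpoint-rule integral of x(s)^2 over (a_{k-1}, a_k], x(s) = cos k(s - a_k)."""
    lo, hi = a[k - 1], a[k]
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        s = lo + (i + 0.5) * h
        total += math.cos(k * (s - a[k])) ** 2
    return total * h


def growth_condition_holds(k, a, C):
    """Test k^2 >= exp(C * a_k), written with logarithms to avoid overflow."""
    return 2.0 * math.log(k) >= C * a[k]
```

Each piece has length $2\pi/k$, so the cosine completes one full period there
and contributes $\pi/k$ to the integral of $x^{2}$; summing over $k$ gives a
harmonic series, which is why the integral diverges. With the sample value
C = 0.1 the growth condition holds from $k = 2$ onward in this sketch.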


THE JOHNS HOPKINS UNIVERSITY. 




REFERENCES. 


[1] G. Borg, "Ueber die Stabilität gewisser Klassen von linearen
Differentialgleichungen," Arkiv för Matematik, Astronomi och Fysik, vol. 21 A,
no. 1 (1944).

[2] P. Hartman, "On differential equations with non-oscillatory eigenfunctions,"
Duke Mathematical Journal, vol. 15 (1948), pp. 697-709.

[3] P. Hartman, "On a theorem of Milloux," American Journal of Mathematics,
vol. 70 (1948), pp. 395-399.

[4] P. Hartman and A. Wintner, "The asymptotic arcus variation of solutions of
linear differential equations of second order," American Journal of Mathematics,
vol. 70 (1948), pp. 1-10.

[5] P. Hartman and A. Wintner, "Criteria of non-degeneracy for the wave
equation," American Journal of Mathematics, vol. 70 (1948), pp. 295-308.

[6] T. Levi-Civita, "Sur les équations linéaires à coefficients périodiques et sur le
moyen mouvement du noeud lunaire," Annales Scientifiques de l'École Normale
Supérieure, ser. 2, vol. 28 (1911), pp. 325-376.

[7] H. Milloux, "Sur l'équation différentielle x'' + A(t)x = 0," Prace
Matematyczno-Fizyczne, vol. 41 (1934), pp. 39-54.

[8] W. E. Milne, "On the degree of convergence of expansions in an infinite interval,"
Transactions of the American Mathematical Society, vol. 31 (1929), pp.
906-918.

[9] H. Weyl, "Ueber gewöhnliche Differentialgleichungen mit Singularitäten und
die zugehörigen Entwicklungen willkürlicher Funktionen," Mathematische
Annalen, vol. 68 (1910), pp. 222-269.

[10] A. Wintner, "On the normalization of characteristic differentials in continuous
spectra," Physical Review, vol. 72 (1947), pp. 516-517.




ON THE LOCATION OF SPECTRA OF WAVE EQUATIONS.* 


By PHILIP HARTMAN and AUREL WINTNER.


Let $f(s)$ be a real-valued, continuous function defined for large positive $s$,
and let $\lambda$ be a real parameter. For every fixed $\lambda$, only real-valued solutions
$x = x(s) \not\equiv 0$ of

(1) $x'' + (f(s) + \lambda)x = 0$

will be considered. Such a solution will be called of class $(L^2)$ if

(2) $\int^{\infty} x^{2}(s)\,ds < \infty$.

The following theorem, particular cases of which are often stated
(apparently for intuitive reasons) in the physical literature (cf., e.g., [3],
p. 72 and p. 83), will be proved:

(I) If

(3) $\limsup_{s \to \infty} f(s) < \infty$

and if a solution $x = x(s)$ of

(4) $x'' + (f(s) + \lambda_0)x = 0$

satisfies

(5) $x(s) = O(1)$, $(s \to \infty)$,

then either (2) holds or $\lambda = \lambda_0$ is in the essential spectrum of (1).


In order to explain the terminology in this assertion, suppose that $f(s)$
is continuous for $0 \le s < \infty$ and that there is assigned to (1) a homogeneous
boundary condition at $s = 0$, such as $x(0) = 0$ or, more generally,

(6) $x(0)\cos\alpha + x'(0)\sin\alpha = 0$

(so that $x(0) = 0$ corresponds to $\alpha = 0$). The boundary condition (6)
belonging to a fixed $\alpha$ determines for (1) a spectrum $S = S(\alpha)$ and a point
spectrum $P = P(\alpha)$ if (1) is of the Grenzpunkt type, that is, if, for some $\lambda$
(but then for every $\lambda$), not all solutions of (1) satisfy (2); cf. [6], p. 238.


* Received June 7, 1948. 




It is known ([6], p. 238) that assumption (3) is sufficient in order that (1)
be of the Grenzpunkt type. It is also known ([6], p. 251) that the set
consisting of the cluster points of $S(\alpha)$ is independent of the choice of $\alpha$.
What in (I) is referred to as the essential spectrum is this cluster set which,
being independent of $\alpha$, can be denoted by $S'$. It is contained in $S(\alpha)$, since
the latter set is closed. Since $P(\alpha)$ also is contained in $S(\alpha)$, it follows
that (I) can be restated as follows:

(I bis) On the half-line $0 \le s < \infty$, let $f(s)$ be a real-valued,
continuous function satisfying (3) and let $x = x(s) \not\equiv 0$ be a solution of (4)
satisfying (5). In terms of this $x(s)$, define $\alpha$ (mod $\pi$) by (6). Then $\lambda = \lambda_0$
is in the spectrum $S(\alpha)$ determined by (1) and (6).


It remains undecided whether or not (I bis) remains true if its assumption
(3) is generalized to the mere requirement that (1) and (6) determine
an eigenvalue problem, that is, that (1) be of the Grenzpunkt type. In the
physical situations occurring in wave mechanics, not only do the potentials
satisfy the unilateral restriction (3) but even

(7) $f(s) = O(1)$, $(s \to \infty)$.


An illustration of the content of (I bis), in the particular case (7), is
afforded by the case of a continuous, periodic $f(s)$, where $0 \le s < \infty$; cf.
[5]. Needless to say, the assertion of (I bis) is known in those cases in
which $f(s)$ is so "regularly small" for large $s$ that the asymptotic behavior
of the general solution of (1), for large $s$ and for arbitrarily fixed $\lambda$, can be
obtained from asymptotic formulae of standard types.

It should be noted that the two cases admitted in the alternative statement
of (I) are not, in general, mutually exclusive. Examples to this effect
can be written down by employing general criteria (cf. [8], p. 269 and [4]).

The first of the cases admitted under the alternative of (I) can be
ruled out if it is assumed that (5) is satisfied by every solution of (4).
This is the content of the following theorem:

(II) If (3) is assumed and if (4) has two linearly independent solutions
satisfying (5), then no solution of (4) satisfies (2) and $\lambda = \lambda_0$ is in
the essential spectrum of (1).


Corresponding to the remark following (Ibis), it remains undecided 
whether or not (II) remains true if its assumption (3) is generalized to 
the mere requirement that (1) be of the Grenzpunkt type. 

In the proofs, the following facts will be needed: 




If $f(s)$ satisfies (3) and if $\lambda_0$ is arbitrary, then, as $s \to \infty$, a solution of (4)

(i) cannot be of class $(L^2)$, unless it is $o(1)$;

(ii) cannot be $o(1)$ unless the first derivative of the solution is $o(1)$;

(iii) cannot be $O(1)$ unless the first derivative of the solution is $O(1)$.

The assertions (i) and (ii) were proved in [7], p. 8, and [1], pp. 324-325,
respectively, and a glance at the proof of (ii) shows that (iii) follows
by exactly the same argument as (ii).

It may be mentioned that if the unilateral restriction (3) is strengthened
to (7), then (4) shows that the respective assumptions, $x(s) = o(1)$ and
$x(s) = O(1)$, of (ii) and (iii), imply that $x''(s) = O(1)$ and so the
respective assertions, $x'(s) = o(1)$ and $x'(s) = O(1)$, of (ii) and (iii) are
contained in Hadamard's standard Tauberian lemma. But his lemma does
not apply if the bilateral assumption (7) is relaxed to (3).


Proof of (I bis). For a fixed $\alpha$ in (6), let $x = x(s) \not\equiv 0$ be a solution
of (4) and (6). If this $x(s)$ is of class $(L^2)$, then $\lambda = \lambda_0$ is in the point
spectrum, $P(\alpha)$, and therefore in the spectrum, $S(\alpha)$. If $x(s)$ is not of
class $(L^2)$, suppose that it satisfies (5). It will be proved that $\lambda = \lambda_0$ must
then be in the essential spectrum, $S'$.

It was shown in [2] that (whether or not (3) is satisfied) every $\lambda$-value
not contained in $S'$ is such that the equation (1) belonging to this $\lambda$-value
possesses a solution $x = y(s) \not\equiv 0$ which is of class $(L^2)$. Hence, in order
to complete the proof of (I bis), it is sufficient to show that if (3) is
assumed of $f(s)$, and (5) of a solution $x = x(s) \not\equiv 0$ of (4), and if $x = y(s)$
is a solution of (4) linearly independent of $x(s)$, then $y(s)$ cannot be of
class $(L^2)$.

Suppose the contrary. Then, $y(s)$ being of class $(L^2)$, it follows from
(i) and (ii) that

(8) $y(s) = o(1)$ and $y'(s) = o(1)$.

On the other hand, (5) and (iii) show that

(9) $x'(s) = O(1)$.

Since (5), (8) and (9) imply that the Wronskian of $x(s)$ and $y(s)$ is $o(1)$,
and since the Wronskian of two solutions of (4) is a constant, it follows that
the latter constant is 0. But this contradicts the assumption that $x(s)$ and
$y(s)$ are linearly independent solutions of (4).
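
The Wronskian step of this proof can be spelled out as follows. The symbols
$W$ and $q$ are introduced here for illustration, with $q(s)$ standing for the
coefficient of $x$ in (4), so that both solutions satisfy $x'' = -q(s)x$.

```latex
\begin{aligned}
W(s) &= x(s)\,y'(s) - x'(s)\,y(s)
      = O(1)\cdot o(1) - O(1)\cdot o(1) = o(1) \qquad (s \to \infty),\\
W'(s) &= x\,y'' - x''\,y = -q(s)\,xy + q(s)\,xy = 0 .
\end{aligned}
```

A constant that tends to 0 must be 0, and a vanishing Wronskian means that the
two solutions are linearly dependent, which is the contradiction.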


Proof of (II). It was shown in [9], pp. 23-24, that, if (3) is assumed,
and if all solutions of (4) and their first derivatives satisfy (5) and (9),
then $\lambda = \lambda_0$ is in the essential spectrum of (1). Hence, (II) follows from
(iii).


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] P. Hartman, "The L²-solutions of linear differential equations of second order,"
Duke Mathematical Journal, vol. 14 (1947), pp. 323-326.

[2] P. Hartman and A. Wintner, "An oscillation theorem for continuous spectra,"
Proceedings of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[3] H. A. Kramers, Grundlagen der Quantentheorie, Leipzig, 1938.

[4] C. R. Putnam, "On the spectra of certain boundary value problems," American
Journal of Mathematics, vol. 71 (1949), pp. 109-111.

[5] S. Wallach, "The stability of differential equations with periodic coefficients,"
Proceedings of the National Academy of Sciences, vol. 34 (1948), pp. 203-204.

[6] H. Weyl, "Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die
zugehörigen Entwicklungen willkürlicher Funktionen," Mathematische
Annalen, vol. 68 (1910), pp. 222-269.

[7] A. Wintner, "(L²)-connections between the potential and kinetic energies of linear
systems," American Journal of Mathematics, vol. 69 (1947), pp. 5-13.

[8] A. Wintner, "Asymptotic integrations of the adiabatic oscillator," ibid., vol. 69
(1947), pp. 251-272.

[9] A. Wintner, "On the location of continuous spectra," ibid., vol. 70 (1948), pp. 22-30.




CONTINUOUS DECOMPOSITIONS.* 


By G. T. WHYBURN.


1. Introduction. The decomposition of a region in the complex plane
into sets $f^{-1}(w)$ effected by a single valued function $f(z)$ analytic in this
region is lower semi-continuous but not necessarily upper semi-continuous
[1,2,3,4,5]¹ in the stronger or open set sense. In other words, the sum
(union) of all elements intersecting any open set is open but the sum of all
elements contained in an open set may be neither empty nor open. Indeed
it is readily seen that if we take the region to be the whole plane, no entire
transcendental function can generate a decomposition upper semi-continuous
in this sense. Of course the decomposition generated by a rational function
is u.s.c. in this sense, because the function is extensible to the compact plane
(complex sphere) and, for compact spaces, open set upper semi-continuity
is equivalent to the weaker limit sense upper semi-continuity which always
holds for the decomposition generated by any continuous mapping.

Since the stronger open set continuity of a decomposition usually limits 
the decomposition to one having compact elements, whereas the transcendental 
entire functions never yield such a decomposition, it seems desirable to 
develop a theory for decompositions into non-compact elements. Further, 
there is need for the introduction of a weaker type of continuity which will 
hold for the decompositions generated by significant classes of analytic 
functions and from which important consequences can be drawn. This in 
brief is the objective of the present paper. 

The term topological space will be used in the Hausdorff sense, i.e.,
it is a space satisfying Hausdorff's three fundamental neighborhood or open
set axioms [6] plus the weakest separation axiom (for any two distinct points
$x, y$ there is an open set containing $x$ but not $y$). When we have a metric
space, distances will be denoted by $\rho(x, y)$ and circular neighborhoods with
center $x$ and radius $r$ by $V_r(x)$.

A mapping f(A) =B is a single valued continuous transformation of 
A onto B. Such a mapping is interior [7,8] or open provided the image 
of every open set in A is open in B, and closed if the image of every closed 
set in A is closed in B. 

Let S be a topological space and let G be a decomposition or partitioning 


* Received June 11, 1948. 
1 Numbers in brackets refer to the bibliography at the end of the paper. 




of $S$ into disjoint closed sets. For any set $X$ in $S$ let $X_g$ denote the sum of
all elements of $G$ intersecting $X$. We consider the conditions:

(a) The sum of all elements of $G$ intersecting an open set is open
(i.e., $U$ open implies $U_g$ open).

(b) The sum of all elements of $G$ intersecting a closed set is closed
($F$ closed implies $F_g$ closed).

(b') The sum of all elements of $G$ intersecting a compact set is closed
($K$ compact implies $K_g$ closed).

(b'') Any element of $G$ intersecting the limit inferior of a sequence of
elements of $G$ contains the limit superior of this sequence.

(1.1) THEOREM. If $S$ is weakly separable² then (b') and (b'') are
equivalent. In particular, (b') and (b'') are equivalent when $S$ is perfectly
separable.

Proof. Suppose (b'). Then let $g_1, g_2, \cdots$ be a sequence with $g_n \in G$, let
$g \in G$ and let $x \in g \cdot \liminf g_n$. Let $y \in \limsup g_n$. It follows from our
hypothesis on $S$ that there exists a sequence $x_1, x_2, \cdots$ with $x_n \in g_n$ and so
that $x_n \to x$ in the sense that every open set about $x$ contains almost all the
points $x_n$. Then if $X = x + \sum_{r}^{\infty} x_n$, where $r$ is chosen so that $y \notin g_n$ for
$n \ge r$, $X$ is compact. Accordingly $X_g$ is closed. Thus $y \in X_g$; and since
$y \notin g_n$, $n \ge r$, and $X_g = g + \sum_{r}^{\infty} g_n$, we must have $y \in g$. Whence
$g \supset \limsup g_n$.

Suppose, on the other hand, that (b'') holds and let $K$ be any compact
set in $S$. If $x$ were a limit point of $K_g$ not belonging to $K_g$, we could find
an infinite sequence of distinct points $x_1, x_2, \cdots$ in $K_g$ converging to $x$.
Then if $g_1, g_2, \cdots$ are the elements of $G$ containing $x_1, x_2, \cdots$ respectively,
$\liminf g_n \ni x$. However, the element $g$ of $G$ containing $x$ does not intersect $K$
(since $x \notin K_g$), whereas $\limsup g_n \cdot K \ne 0$ since $K$ is compact and $K \cdot g_n \ne 0$
for all $n$.


2. Semi-closed mappings. A mapping f(A) = B will be called semi- 
closed provided the image of every compact set in A is closed in B, it being 
understood that A and B are topological spaces. Of course, the image of 
every compact set is compact but it may fail to be closed unless further 
restrictions are put on the image space B or on the mapping. This is brought 
out sharply in the following: 


² That is, satisfies the first Hausdorff countability axiom that for each $p \in S$ there
exists a monotone decreasing sequence $U_1 \supset U_2 \supset \cdots$ of open sets closing down on $p$
in the sense that for any open set $U$ in $S$ containing $p$, we have $p \in U_n \subset U$ for some $n$.




EXAMPLE. There exists an interior mapping $f(A) = B$ of a compact
metric space $A$ onto a compact and countable but non-regular topological
space $B$.

For let $A$ consist of two disjoint sequences of real numbers $(x_n)$ and
$(y_n)$ converging to distinct limits $a$ and $b$ with the real number system metric.
Let $B$ consist of points $a', b', z_1, z_2, z_3, \cdots$; and in $B$ let the empty set,
sets of the types $(z_n)$, $a' + \sum_{n}^{\infty} z_i$, $b' + \sum_{n}^{\infty} z_i$ (for all $n$), and any set
which is the sum of sets of this type be open sets. It is then readily verified
that $A$ is compact and metric and $B$ is a topological space. Also $B$ is compact since
every infinite set of points in $B$ has both $a'$ and $b'$ as limit points. However,
$B$ is not regular, since the closure of every open set containing $a'$ also contains
$b'$. Finally, if we define $f(x_n) = f(y_n) = z_n$ $(n = 1, 2, \cdots)$, $f(a) = a'$,
$f(b) = b'$, $f$ maps $A$ continuously and interiorly onto $B$. This mapping is
not semi-closed, because the compact set $a + \sum_{1}^{\infty} x_n$ maps onto the set
$a' + \sum_{1}^{\infty} z_n$ which is compact but not closed as $b'$ is a limit point of it.


(2.1) THEOREM. Let $f(A) = B$ be an interior and semi-closed mapping,
where $A$ and $B$ are topological spaces. If $A$ is perfectly separable, so also
is $B$. If $A$ is regular and locally compact, so also is $B$. If $A$ is locally
compact, separable and metric, so also is $B$.

Proof. To prove the first statement, let $R_1, R_2, \cdots$ be a fundamental
sequence of open sets (basis) in $A$. Then $f(R_1), f(R_2), \cdots$ is a fundamental
sequence of open sets in $B$. For if $x \in B$ and $U$ is any open set in $B$
containing $x$, $f^{-1}(U)$ is open and hence contains a set $R_n$ which intersects
$f^{-1}(x)$. Accordingly $x \in f(R_n) \subset U$.

For the second statement, let $x \in B$ and let $U$ be any open set in $B$
containing $x$. Since $f^{-1}(U)$ is open, there exists an open set $V$ such that
$V \cdot f^{-1}(x) \ne 0$ and $\overline{V}$ is compact and contained in $f^{-1}(U)$. Then $f(V)$ is
open, and contains $x$ and $\overline{f(V)} \subset f(\overline{V}) \subset U$, since $f(V) \subset f(\overline{V})$ and $f(\overline{V})$
is closed.

The final statement is an immediate consequence of the two preceding
ones.

(2.11) COROLLARY. Given $f(A) = B$ interior, where $A$ and $B$ are
topological spaces, if $A$ is perfectly separable, so also is $B$.


In any weakly separable topological space satisfying the axiom that 
two distinct points lie in disjoint open sets, every compact set is closed. 
Thus in particular in a regular topological space every compact set is closed. 
Whence 




(2.2) THEOREM. Any continuous mapping $f(A) = B$ of a topological
space $A$ onto a weakly separable regular topological space $B$ is semi-closed.

(2.3) THEOREM. If $A$ is locally compact, separable and metric and
$f(A) = B$ is interior, where $B$ is a topological space, a necessary and sufficient
condition that $B$ be locally compact, separable and metric is that $f$ be
semi-closed.

Proof. By the above theorem, separability and regularity of $B$ imply
that $f$ is semi-closed. On the other hand, if $f$ is semi-closed, by (2.1) $B$ is
locally compact, separable and metric.


3. The decomposition space. Natural mapping. Let $G$ be a decomposition
of a topological space $S$ into disjoint closed sets. We set up a
decomposition or hyperspace $S'$ as follows. The elements of $G$ become the
points of $S'$, and a set in $S'$ is open if and only if the sum (union) of the
corresponding elements of $G$ is an open set in $S$. It follows at once that if
$G$ satisfies (a) a set in $S'$ is open if and only if the sum of the corresponding
elements of $G$ is the set $X_g$ for some open set $X$ in $S$. For convenience we
also define any open set in $S'$ containing $x' \in S'$ to be a neighborhood of $x'$.

For convenience of reference we state next the well known

(3.1) THEOREM. The decomposition space $S'$ is a topological space.
For a proof of this theorem the reader is referred to [2], p. 61.

Definition. If for each $x \in S$ we define $\varphi(x)$ to be the point in $S'$ given
by the element of $G$ containing $x$, we obtain the natural mapping of $S$ onto $S'$
generated by $G$.
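
The bookkeeping behind $X_g$, condition (a), the hyperspace and the natural
mapping can be tried out on a finite toy example. The sketch below is only an
illustration with hypothetical names: a finite space satisfying the separation
axiom of this paper is discrete, so the toy spaces used here do not satisfy it,
and the second partition is deliberately not made of closed sets.

```python
from itertools import combinations


def saturation(X, G):
    """Return X_g, the union of all elements of G that intersect X."""
    out = set()
    for g in G:
        if g & X:
            out |= g
    return frozenset(out)


def satisfies_a(opens, G):
    """Condition (a): U open implies U_g open."""
    return all(saturation(U, G) in opens for U in opens)


def hyperspace_opens(opens, G):
    """Open sets of the hyperspace: subcollections of G whose union is open."""
    elements = list(G)
    found = set()
    for r in range(len(elements) + 1):
        for sub in combinations(elements, r):
            if frozenset().union(*sub) in opens:
                found.add(frozenset(sub))
    return found


def natural_map(x, G):
    """Return the element of G containing x (the point phi(x) of the hyperspace)."""
    for g in G:
        if x in g:
            return g
    raise ValueError("x is not covered by G")


# Toy space {0, 1, 2, 3}; its open sets are the unions of the blocks {0,1}, {2,3}.
S = frozenset({0, 1, 2, 3})
opens = {frozenset(), frozenset({0, 1}), frozenset({2, 3}), S}
G = [frozenset({0, 1}), frozenset({2, 3})]

# A partition into sets that are NOT closed, shown only to exhibit a failure of (a).
opens2 = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})}
G2 = [frozenset({0, 2}), frozenset({1})]
```

In the first toy space every union of elements of $G$ is open, so condition (a)
holds and the hyperspace has two points and four open sets. In the second, the
open set {0} has saturation {0, 2}, which is not open, so (a) fails there.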


It follows directly from the definition that this mapping $\varphi$ is continuous
if condition (a) is satisfied, as is well known. Also, in this case, we have

(3.2) THEOREM. The natural mapping $\varphi$ generated by a decomposition
$G$ satisfying condition (a) is interior.

For, if $U$ is any open set in $S$, $\varphi(U)$ is the set of points in $S'$ given by
elements of $G$ intersecting $U$ and thus $\varphi(U)$ is open in $S'$. Thus from (2.11)
we have

(3.21) COROLLARY. If $S$ is perfectly separable so also is $S'$.

(3.3) THEOREM. The natural mapping generated by a decomposition
$G$ of $S$ satisfying conditions (a) and (b') is also semi-closed.

For let $K$ be any compact set in $S$ and let $H = \varphi(K)$. Then if $K_g$
denotes the sum of all elements of $G$ intersecting $K$, $S - K_g$ is open (since
$K_g$ is closed). Accordingly $\varphi(S - K_g) = S' - H$ is open, so that $H$ is closed.
Thus from (2.1) we have

(3.4) THEOREM. If the decomposition $G$ of a locally compact separable
metric space $S$ satisfies (a) and (b'), the hyperspace $S'$ is locally compact
separable and metric and the natural mapping $\varphi(S) = S'$ is continuous,
interior and semi-closed.


4. Decompositions generated by mappings. We next consider the 
converse situation in which we start with an interior mapping and study the 


natural decomposition generated in the range space. 


(4.1) THEOREM. Let $f(A) = B$ be an interior mapping, where $A$ and
$B$ are topological spaces, let $G$ be the decomposition of $A$ into the sets
$[f^{-1}(y)]$, $y \in B$, let $A'$ be the hyperspace of this decomposition and $\varphi(A) = A'$
the associated natural mapping. Then $\varphi$ is interior. Further, the mapping
$f\varphi^{-1}$ is 1-1 and interior and thus is a homeomorphism of $A'$ onto $B$.

To show that $\varphi$ is interior it suffices in view of (3.2) to show that the
decomposition $G$ satisfies condition (a). To this end let $U$ be any open set
in $A$. Then since $U_g$ is precisely the set $f^{-1}f(U)$, it follows that $U_g$ is open
since $f(U)$ is open by interiority of $f$ and $f^{-1}f(U)$ is open by continuity of $f$.

To see that $f\varphi^{-1}$ is interior, let $V$ be any open set in $A'$. Then $\varphi^{-1}(V)$
is open by continuity of $\varphi$ and $f\varphi^{-1}(V)$ is then open in $B$ by interiority of $f$.

(4.11) COROLLARY. If $B$ is regular and weakly separable, so also is $A'$;
and the mappings $f$ and $\varphi$ are semi-closed.

Now if we denote the homeomorphism $f\varphi^{-1}$ by $h$, it results at once that
for each $x \in A$

$h\varphi(x) = f\varphi^{-1}\varphi(x) = f(x)$.

Whence,
Whence, 


(4.2). The natural mapping $\varphi$ generated by the decomposition given
by an interior mapping $f(A) = B$ is topologically equivalent to $f$.

Thus the cycle closes when we begin with an interior mapping, take the 
decomposition of the range space, set up the decomposition space and take 
the natural mapping of the decomposition, in the sense that we get back a 
mapping topologically equivalent to the one we started with. This is not 
true in general with merely continuous mappings on non-compact spaces. 

Inasmuch as decompositions generate natural mappings of the original 
space onto the hyperspace, and mappings generate natural decompositions 
of the original space, and these two processes are equivalent in a very real 




sense, it would be possible to dispense with the study of decompositions and
just study mappings or conversely. However, both viewpoints have proven
most fruitful in the past (see [1,2,3,4,8]). Each has contributed materially
to the development and enrichment of topological and function-theoretic 
results, due to the fact that some relationships reveal themselves intuitively 
in terms of decompositions while others show up more naturally as mapping 
theorems. Thus the two viewpoints have played a complementary role in 
appealing to mathematical intuition; and it seems reasonable to suppose they 
may continue to do so. 


5. Upper semi-continuity in the open set sense. A decomposition $G$
of a topological space $S$ into disjoint closed sets satisfying condition (b) is
u.s.c. in the open set sense. As originally defined by Moore [1], Alexandroff
[2] and others, $G$ is u.s.c. provided that if $g \in G$ and $U$ is any open set
about $g$, there exists an open set $V$ about $g$ such that any element of $G$
intersecting $V$ lies in $U$. Clearly the two definitions are equivalent to each
other and equivalent in turn to the condition that the sum of all elements
of $G$ contained in any given open set be open (possibly empty). It is clear
that in a compact metric space, conditions (b), (b') and (b'') are equivalent.
Also [8] in a locally compact separable metric space, (b) and (b') are
equivalent provided the elements of $G$ are continua.

That condition (b), in the presence of (a), tends very strongly to limit 
the decomposition to having compact elements and thus is too strong for our 
purposes is shown by 

(5.1) THEOREM. If $G$ is a non-degenerate decomposition of a connected
separable metric space into disjoint closed sets satisfying conditions (a) and
(b), the elements of $G$ are necessarily compact.


Proof. This theorem follows from a theorem of A. D. Wallace [9]. 
For it is readily verified that the conditions in Wallace’s theorem follow from 
conditions (a) and (b). 


6. Continuity condition for locally connected spaces. We next consider
the following conditions for a decomposition $G$ of a topological space $S$
satisfying (a).

(c) For any open set $U$ in $S$ which is the sum of the elements of a
subcollection of $G$ corresponding to a connected set in $S'$ and any component
$Q$ of $U$, we have $Q_g = U$.

(c') For any region $R$ in the decomposition space $S'$, each component
of the inverse of $R$ maps onto $R$ under the natural mapping $\varphi$.

(c'') For any $y \in S'$ and any open set $U$ in $S'$ containing $y$, there exists
a region $R$ with $y \in R \subset U$ such that each component of the inverse of $R$
maps onto $R$ under $\varphi$.

Note. A region is a connected open set.


It is clear that (c’) is merely a restatement of (c) in terms of the 
natural mapping. Hence (c) and (c’) are equivalent in any topological 
space. 


(6.1) THEOREM. If $S$ is a locally connected topological space, then
for any decomposition $G$ of $S$ into disjoint closed sets satisfying (a),
conditions (c), (c') and (c'') are equivalent.

It suffices to show that (c') and (c'') are equivalent; and since (c')
obviously implies (c'') we have left to show that (c'') implies (c').

To this end let $R$ be any region in $S'$ and $Q$ any component of $\varphi^{-1}(R)$.
Since $\varphi(Q)$ is open and $\varphi(Q) \subset R$, if $\varphi(Q) \ne R$ there would exist a point
$y \in \overline{\varphi(Q)} \cdot [R - \varphi(Q)]$. Let $E$ be a region satisfying $y \in E \subset R$ and such
that any component of $\varphi^{-1}(E)$ maps onto $E$ under $\varphi$. Since $y \in \overline{\varphi(Q)}$,
there exists a $z \in \varphi(Q) \cdot E$. Let $w \in Q \cdot \varphi^{-1}(E)$ and let $H$ be the component
of $\varphi^{-1}(E)$ containing $w$. By hypothesis $\varphi(H) = E \ni y$. But since
$\varphi(H) = E \subset R$, we have $H \subset \varphi^{-1}(R)$; and since $H \cdot Q \ne 0$ and $Q$ is
a component of $\varphi^{-1}(R)$ we must have $Q \supset H$ and therefore $\varphi(Q) \supset \varphi(H)
= E \ni y$, contrary to $y \in R - \varphi(Q)$.


7. Continuity in the open set sense. A decomposition $G$ of a topological
space $S$ into disjoint closed sets satisfying (a) and (b) will be
called continuous in the open set sense. That is, $G$ is continuous in the
open set sense if for any open set $U$ in $S$ the sums $U_g$ of all elements
intersecting $U$ and $U_0$ of all elements contained in $U$ are open sets.

(7.1) THEOREM. In a locally connected, locally compact separable metric space
$S$, any decomposition $G$ of $S$ satisfying (a), (b') and (c) is upper
semi-continuous in the open set sense at each compact element of $G$.

This means that if $X \in G$ is compact and $U$ is any open set containing
$X$, there exists an open set $V$ such that any element of $G$ intersecting $V$ lies
in $U$. By local compactness of $S$ we may suppose $U$ so chosen that $\overline{U}$ is
compact. We shall now show that the sum $U_0$ of all elements of $G$ contained
wholly in $U$ is an open set so that it may be taken as our $V$. To this end let
$Y$ be any element of $G$ contained in $U$. Let $F(U)$ denote the boundary
$\overline{U} - U$ of $U$, let $\varphi$ be the natural mapping of $G$ and let $R'$ be the component
of $S' - \varphi[F(U)]$ containing $\varphi(Y)$. Since each component of $\varphi^{-1}(R')$
intersects $Y$, by (c), whereas $\varphi^{-1}(R') \cdot F(U) = 0$, we have

$\varphi^{-1}(R') \subset U$.

Since $\varphi^{-1}(R')$ is open and contains $Y$, it follows that the sum $U_0$ of all
elements of $G$ lying wholly in $U$ is open.


(7.2) THEOREM. In order that a non-degenerate decomposition G of a 
connected, locally connected, locally compact separable metric space S into 
disjoint closed sets be continuous in the open set sense, it is necessary and 
sufficient that the elements of G be compact and satisfy conditions (a), (b’) 
and (c). 


To prove the necessity, suppose G is continuous in the open set sense, 
i.e., that it satisfies (a) and (b). Then by (5.1) the elements of G are 
compact. Condition (b’) follows from (b) by (1.1) since (b) implies (b”) 
in any separable metric space. To show (c), by (6.1) it suffices to establish 
(c’). To this end let $y \in S'$ and let V be any open set in S containing $\phi^{-1}(y)$ and such that $\overline{V}$ is compact. By (b) the sum $V_0$ of all elements of G contained in V is open and hence $\phi(V_0)$ is open by interiority of $\phi$. Now if U is any given open set in S’ containing y we have $y \in U \cdot \phi(V_0)$; and if R is any region in S’ satisfying $y \in R \subset U \cdot \phi(V_0)$ and Q is any component of $\phi^{-1}(R)$, we shall show that $\phi(Q) = R$. This results from the fact that $Q \subset V_0 \subset V$ and hence Q is conditionally compact. For if $\phi(Q) \neq R$, since $\phi(Q)$ is open, there would be a limit point p of $\phi(Q)$ belonging to $R - \phi(Q)$. Then if $(p_n)$ is a sequence in $\phi(Q)$ converging to p and $x_n \in Q \cdot \phi^{-1}(p_n)$, there would exist a limit point x of $(x_n)$ in $\overline{Q}$. But by continuity of $\phi$, $\phi(x) = p$ so that $x \in \phi^{-1}(p) \subset \phi^{-1}(R)$ and this gives $x \in Q$ since Q is a component of $\phi^{-1}(R)$, contrary to $p \in R - \phi(Q)$. Accordingly $\phi(Q) = R$ and (c) is established. 

The sufficiency of the conditions is a direct consequence of (7.1). 


In connection with (7.2) it may be of interest to remark that in case 
A is compact (i.e., A is a locally connected continuum) compactness of the 
point inverses results from the continuity of f and condition (b’) is equi- 
valent to (b). Thus (a) and (b’) are equivalent to (a) and (b), and we 
obtain (c) as a necessary consequence of (a,b). This latter fact is known 
(See (7.4), p. 147 of [8]). The property (c) may be used to characterize 
the quasi-monotone mappings on locally connected continua, as shown by 
Wallace (See p. 152 of [8] for example). 


8. Conclusion. Our results indicate that in spaces which are locally 
connected generalized continua (= locally compact, separable, metric and 
connected), conditions (a), (b’) and (c) provide limitations on a decomposition which it would be reasonable to call continuity of the decomposition. 
Surely open set continuity is too strong if we wish to make more effective 
contact with analytic functions. For it is now clear that the decomposition 
generated in the complex plane by no transcendental entire function can be 
continuous in the open set sense, since this would imply compactness of the 
elements whereas at most one of them could be compact by the Picard 
Theorem. Further, in case the elements of the decomposition are compact, 
conditions (a), (b’) and (c) reduce to open set continuity as shown in (7.2). 

On the other hand, the decomposition effected in the complex plane by many transcendental entire functions will satisfy these conditions. For example the exponential function $e^z$, $\sin z$, or indeed any periodic entire function, generates a decomposition continuous in this (a,b’,c) sense. Also it will be shown in another paper that the decomposition in the complex plane effected by any entire function of order < 1/2 is continuous in this sense. In the same paper significant consequences will be established concerning mappings which generate (a,b’,c) continuous decompositions. For example, a function such as $ze^z$ or $(z - a_1)(z - a_2) \cdots (z - a_k)e^z$ would not generate a continuous decomposition in this sense because these functions vanish at only a finite number of points but take other values infinitely many times. 
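As an illustration of the two kinds of example just mentioned (a sketch only, not part of the original argument), the elements of the decomposition generated by $e^z$ can be written out explicitly. The symbols $w$, $z_0$ and $k$ are introduced here purely for the illustration.

```latex
% Elements of the decomposition generated by f(z) = e^z.
% For w \neq 0 fix one solution z_0 of e^{z_0} = w; then
\[
  f^{-1}(w) \;=\; \{\, z_0 + 2\pi i k \;:\; k \in \mathbb{Z} \,\}, \qquad w \neq 0 ,
\]
% a closed, discrete, unbounded set, hence not compact, while
\[
  f^{-1}(0) \;=\; \emptyset .
\]
% By contrast, for f(z) = z e^{z} the value 0 is taken only at z = 0,
\[
  f^{-1}(0) \;=\; \{0\},
\]
% whereas by the Picard theorem every other value is taken infinitely often.
```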


UNIVERSITY OF VIRGINIA. 


BIBLIOGRAPHY. 


1. R. L. Moore, Foundations of Point Set Theory, American Mathematical Society 
Colloquium Publications, vol. 13 (1932). See references therein to earlier 
papers on semi-continuous collections by Moore and others. 

2. P. Alexandroff and H. Hopf, Topologie I, Berlin, 1935, Springer. 

3. L. Vietoris, “Über stetige Abbildungen einer Kugelfläche,” Proceedings, Akademie 
van Wetenschappen, Amsterdam, vol. 29 (1926), pp. 443-453. 

4. C. Kuratowski, “Sur les décompositions semi-continues d’espaces métriques compacts,” 
Fundamenta Mathematicae, vol. 11 (1928), pp. 169-185. 

5. R. Vaidyanathaswamy, Treatise on Set Topology, 1947, Indian Mathematical Society. 

6. F. Hausdorff, Mengenlehre, 1927, Berlin and Leipzig. 

7. S. Stoilow, Leçons sur les Principes Topologiques de la Théorie des Fonctions 
Analytiques, 1938, Paris. 

8. G. T. Whyburn, Analytical Topology, American Mathematical Society Colloquium 
Publications, vol. 28 (1942). 

9. A. D. Wallace, “Some characterizations of interior transformations,” American 
Journal of Mathematics, vol. 61 (1939), p. 761. 


QUASI-MONOTONE SERIES.* 


By TOMLINSON FORT. 

Monotone series, that is, series of the type $a_1 + a_2 + \cdots + a_n + \cdots$ where $a_n \ge a_{n+1} > 0$, have been extensively studied. Szász [this JOURNAL, vol. 70 (1948), p. 203] generalizes the idea of monotoneity to series with 

(1) $0 < a_{n+1} \le (1 + \alpha/n) a_n$, $\alpha \ge 0$. 

For series of this type he establishes the “Cauchy condensation test” and the “Cauchy integral test” for convergence. In the present note the idea of monotoneity is further generalized not only in the domain of reals but also in the domain of complex numbers. 

We consider the series 

(2) $\sum a_n$, 

where $a_n = x_n + i y_n$. We assume $x_n \ge 0$, $y_n \ge 0$ and let $t_n = x_n + y_n$. 
We then assume 

(3) $t_{n-2} + t_{n-1} \ge 2 t_n (1 - 1/b_n)$ 
and 

where 

$1 < b_n \le b_{n+1}$ and $0 < B_n \le B_{n+1}$. 


Under these circumstances¹ we call (2) quasi-monotone in the mean. 

1. We prove the following lemma: 

LEMMA I. 

(5) $t_{m-1} + t_m + \cdots + t_n \ge t_n[1 + (1 - 1/b_n) + (1 - 1/b_n)(1 - 1/b_{n-1}) + \cdots + (1 - 1/b_n)(1 - 1/b_{n-1}) \cdots (1 - 1/b_{m+1})]$, $n > m$. 

* Received June 28, 1948. 

¹ Relations (3) and (4) can easily be generalized so as to include a larger number of terms. It is also true that for certain results it is not necessary to assume both (3) and (4). This is true of Theorem I. 




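Proof of Lemma I. We establish the following relation which holds either for $\delta = 0$ or for $\delta = 1$. 

(6) $\delta t_{m-1} + t_m + \cdots + t_n \ge t_n[1 + (1 - 1/b_n) + (1 - 1/b_n)(1 - 1/b_{n-1}) + \cdots + (1 - 1/b_n)(1 - 1/b_{n-1}) \cdots (1 - 1/b_{m+1}) + \delta(1 - 1/b_n)(1 - 1/b_{n-1}) \cdots (1 - 1/b_m)]$. 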
Relation (3) implies 
+ tna = (1—1/bn) tn + (1 —1/bn) (1 — 1/bn-1) tn. 
If = (1—1/)n) tn, we write 


(7) trate: 2 + (1—1/bn)] + 
+ 

Tf < (1—1/0n)tn, then necessarily 

(8) tno > (1—1/bn) (1 —1/dn-1) tn 

and we write 


tn + 8tma 
(9) Stal + (1—1/bn) + (1-1/8) (1— 
+ tas + tra + 
Next substitute in (7) for tn. the expression (1—1/On1)tn+ if ths 


= (1—1/bn-1)tn+ and then for as above. If (1—1/Ddn-1) tna, 
we substitute for ¢,2 + tn-3 the expression 


[(1—1/dn-+) + (1 —1/dn-1) (1 — 1/0n-2) 


and then (1—1/bn)tn for tn. We substitute similarly in (9) for ¢n_3 or 
tn-s + tn, and then for tn. from (8). The value 1 or 0 is assigned to the 
symbol & according as the term tm, can or can not be used in the last step 


of the process. 
Let $\delta = 1$ in the left-hand member of (6) and $\delta = 0$ in the right-hand member and we have (5). 
We now state without proof the following lemma. 


Lemma II. tm + 


S tm[1 + (14+ 1/Bn) + (1+ 1/Bm) (1 + 1/Bst) 
+ (141/Bm)(1+1/Bma) (1 m<n. 




THEOREM I. Let (2) be quasi-monotone in the mean and convergent. 
Let m be an integer less than n but a function of n such that 
$m \to \infty$ when $n \to \infty$ and $[(n - m)/b_n] > c > 0$. 
Then $a_n b_m \to 0$. 
Interesting special cases of this theorem are: $b_n = n^{\alpha}$, $0 < \alpha \le 1$, $m = [n/2]$.² We then have $n^{\alpha} a_n \to 0$. Similarly, if $b_n = \log n$ we see that $(\log n) a_n \to 0$. 


Proof of Theorem I. Choose M so that when n > M 
$|a_{m-1} + a_m + \cdots + a_n| \le \epsilon$. 


Then, using Lemma J, 


Qe = tna 
& 4 (1 — 1/6.) + (1 
= + (1—1/bm) + (1—1/bm)? ++ 
= tnbm[1 — (1 — = thbm[1 — {(1 —1/dm) 
= tnbmpn, where pn > € > 0. 


Hence $t_n b_m \to 0$. But $t_n \ge |a_n|$. Hence $a_n b_m \to 0$. 
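
The step in the proof that replaces a sum of powers of $1 - 1/b_m$ by a closed form is the ordinary geometric sum. A short sketch follows; the abbreviations $q$ and $N$ are introduced here for convenience and are not in the original, and the final bound is one standard way (assumed here, not taken from the text) to see that the bracket stays away from zero.

```latex
% Geometric sum with ratio q = 1 - 1/b_m, where b_m > 1, so 0 < q < 1.
\[
  1 + q + q^{2} + \cdots + q^{N}
  \;=\; \frac{1 - q^{N+1}}{1 - q}
  \;=\; b_m\Bigl[\,1 - \Bigl(1 - \frac{1}{b_m}\Bigr)^{N+1}\Bigr].
\]
% Since \log(1 - x) \le -x for 0 \le x < 1,
\[
  \Bigl(1 - \frac{1}{b_m}\Bigr)^{N+1} \;\le\; e^{-(N+1)/b_m},
  \qquad\text{so}\qquad
  1 - \Bigl(1 - \frac{1}{b_m}\Bigr)^{N+1} \;\ge\; 1 - e^{-(N+1)/b_m}.
\]
% The right-hand side is bounded below by a positive constant
% whenever (N+1)/b_m is bounded below by a positive constant.
```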

Next let us be given a sequence of integers $g_k$ such that 
(10) $g_{k+2} - g_{k+1} \le L(g_{k+1} - g_k)$, 
where L is a constant, and such that 

THEOREM II. A necessary and sufficient condition that the series (2) which is assumed quasi-monotone in the mean, converge is that $\sum (g_{k+1} - g_k) a_{g_k}$ converge. 
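
For orientation, the classical Cauchy condensation test corresponds to the choice $g_k = 2^k$. The sketch below only checks condition (10) for this choice, with the constant $L = 2$; it is an illustration and not part of the original proof.

```latex
\[
  g_k = 2^{k}: \qquad
  g_{k+2} - g_{k+1} \;=\; 2^{k+1} \;=\; 2\,(g_{k+1} - g_k),
\]
% so (10) holds with L = 2, and the condensed series takes the familiar form
\[
  \sum_{k} (g_{k+1} - g_k)\, a_{g_k} \;=\; \sum_{k} 2^{k}\, a_{2^{k}} .
\]
```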


Proof. If series (2) converges it converges absolutely. 
Next by Lemma I and the fact that $b_{n+1} \ge b_n$ we see that 


—{(1— 1/by,) bax} /Dox | 


² [n/2] denotes the largest integer not greater than n/2. 


(11) — gx) <M, [ — 9x) <M. 


Similarly, 


| ay, + toner + + | = to, + toner + 
S ty + (1+ 1/Bo) + (14+1/By) (1 +1/Bo,.-1) ] 
= ty, Bo, [1 { (1 1/By,) Box} /Bor 
= Cty, Bo, = | Ag, | Bo,. 

The theorem follows by virtue of these relations and of (10) and (11). 


We next proceed to a generalization of the Cauchy integral test. Let $b(\alpha)$ be a real monotonic increasing function defined when $\alpha \ge 0$ and always greater than 1. 
We assume a function $a(\alpha) = x(\alpha) + i y(\alpha)$ and let $t(\alpha) = x(\alpha) + y(\alpha)$. 
We call $a(\alpha)$ quasi-monotone in the mean if $x(\alpha) \ge 0$, $y(\alpha) \ge 0$ and 


(12) t(a +B) + t(a+ 841) = 
so long as $\alpha \ge 0$ and $0 \le \beta \le 1$. We write (12) for brevity 
t(a +8) + t(a+B+1) P(a)t(a+2). 


All functions are assumed integrable in the Riemann sense. 


THEOREM III. A necessary and sufficient condition that $\sum_{\alpha=1}^{\infty} a(\alpha)$ converge is that $\int_1^{\infty} a(\alpha)\,d\alpha$ converge. 


Proof of Theorem III. We have 
> P(a)t(a+2) =P(1) St(a+2). 
Next let «+ and 1—fB=r. Then 


t(o) + t(o +1) 


Hence 


(14) Stade s to) 


The theorem follows from (13) and (14) and the relation, 


(a) = | a(a) | S242). 


THE UNIVERSITY OF GEORGIA. 


SOME THEOREMS ON THE DIMENSION OF FIBRE SPACES.* 
By S. D. Liao. 


The aim of this paper is to establish some theorems which are concerned 
with the relationship between the dimensions of a fibre space, its base space, 
and its fibres. The notion of a fibre space is understood in the sense of 
Hurewicz-Steenrod [1]. Following their paper we shall make consistent 
use of the notations: 


X = fibre space, $\sigma$ being its metric; 
B = base space, $\rho$ being its metric; 
$\pi$ = projection of X on B; 

$\phi(x, b)$ is the slicing function, so that $\phi(x, b)$ is defined for $x \in X$, $b \in B$, with $\rho(\pi(x), b) < \epsilon_0$. 

We shall always suppose that both X and B are separable. 


Let T be a separable metric space and S a subset of T. The least dimension of the open subsets W of T with $W \supset S$ will be denoted by dim (S, T). With this notation we shall state the theorems which we intend to prove in this paper as follows: 


THEOREM 1. Let a separable metric space X be a fibre space over a polyhedron B with compact fibres. Then for any point b of B, 
I) $\dim (b, B) + \dim \pi^{-1}(b) \le \dim (\pi^{-1}(b), X)$. 


THEOREM 2. Let a finite dimensional compactum X be a fibre space over a polyhedron B. The subset B* of B which consists of all points b of B for which $\dim (b, B) + \dim \pi^{-1}(b) = \dim (\pi^{-1}(b), X)$ holds, contains an open dense subset of B. 


THEOREM 3. Under the same hypotheses as Theorem 2, let k denote the greatest dimension of the fibres of X. If B has homogeneous dimension, then $n + k = m$, where $m = \dim X$, $n = \dim B$. 


THEOREM 4. Let a compactum X of dimension m > 0 be a fibre space over a polyhedron B of dimension n. If X possesses the following property: 
J) Any compact neighborhood retract T of X of dimension < m which disconnects X has non-vanishing (m−1)-dimensional Čech homology group $H^{m-1}(T, \mathfrak{R})$,¹ 
then: (i) all the fibres of X have the same dimension $m - n$, and (ii) B is a strongly connected homogeneous n-dimensional polyhedron. 

* Received January 17, 1948. 


THEOREM 5. An irreducible closed polyhedron has the property J of Theorem 4. Hence the conclusions of Theorem 4 hold, if it is a fibre space over a polyhedron. 


Theorems 1, 2 and 3 give the relationship between the dimensions of X, B and $\pi^{-1}(b)$, while Theorems 4 and 5 give sufficient conditions that the fibres have the same dimension. 


1. Proof of Theorem 1. 


LEMMA 1. Let a metric space X be a fibre space over a metric space B with compact fibres. Let b be an arbitrary point of B and $\zeta$ an arbitrary positive real number. There is a positive real number $\epsilon < \epsilon_0$ such that $\sigma(\phi(x, u), x) < \zeta$ for any point (x, u) of $X \times B$ with $x \in \pi^{-1}(b)$ and $\rho(b, u) < \epsilon$. 


Proof. Suppose the contrary. There is a sequence $\{\epsilon_n\}$ of positive real numbers $\epsilon_n < \epsilon_0$, which converges to 0, and there is a sequence $\{(x_n, u_n)\}$ of points of $X \times B$ with $x_n \in \pi^{-1}(b)$ and $\rho(b, u_n) < \epsilon_n$ such that $\sigma(\phi(x_n, u_n), x_n) \ge \zeta$, $n = 1, 2, 3, \cdots$. Since $\pi^{-1}(b)$ is a compactum, we can, by changing to a subsequence if necessary, assume that the sequence $\{x_n\}$ in $\pi^{-1}(b)$ converges to a point $x_0$ of $\pi^{-1}(b)$. Then we have $\lim \phi(x_n, u_n) = \phi(x_0, b)$ and $\lim \sigma(\phi(x_n, u_n), x_n) = 0$ which contradicts the inequalities $\sigma(\phi(x_n, u_n), x_n) \ge \zeta$, $n = 1, 2, 3, \cdots$. This proves Lemma 1. 


Proof of Theorem 1. Clearly there is an open subset $U_0$ of X with $U_0 \supset \pi^{-1}(b)$ and $\dim U_0 = \dim (\pi^{-1}(b), X)$. There is a positive real number $\zeta$ such that $U_0$ contains the $\zeta$-neighborhood of $\pi^{-1}(b)$. Let $\epsilon < \epsilon_0$ be the number corresponding to $\zeta$ and b according to Lemma 1. We decompose the polyhedron B into a (locally finite) complex K of mesh $< \epsilon$ (which means that every cell of K is of diameter $< \epsilon$). Let M be the closed star of b with respect to K. It follows that $\phi(x, u) \in U_0$ for all $x \in \pi^{-1}(b)$, $u \in M$, or by setting $P = \pi^{-1}(b) \times M$, that $\phi(P) \subset U_0$. 


¹ $\mathfrak{R}$ is the group of the real numbers modulo 1. 




Since M is a polyhedron and $\pi^{-1}(b)$ is a compactum, we have² $\dim P = \dim \pi^{-1}(b) + \dim M$. Since $\dim M = \dim (b, B)$, to prove the inequality I, it is sufficient to prove that $\dim P \le \dim U_0$. 

Let $\epsilon$ be an arbitrary positive real number. Clearly P is a compactum and the partial mapping $\phi | P$ of P into X is uniformly continuous. Hence there exists a number $\delta > 0$ such that $\sigma(\phi(x, u_1), \phi(x, u_2)) < \epsilon/2$ whenever $\rho(u_1, u_2) < \delta$, where $(x, u_1), (x, u_2) \in P$. We choose an $\eta$, $0 < \eta < 1$, so that $\eta\rho(u, b) < \delta$ holds for any $u \in M$. Then we construct the continuous mapping $g(x, u) = \phi(x, \eta u + (1 - \eta)b)$, $(x, u) \in P$, and assert that g is an $\epsilon$-mapping of P into $U_0$. In fact, let $g(x_1, u_1) = g(x_2, u_2)$. We get $\eta u_1 + (1 - \eta)b = \eta u_2 + (1 - \eta)b$, and hence $u_1 = u_2$. Thus, 

$\sigma(x_1, x_2) \le \sigma(x_1, g(x_1, u_1)) + \sigma(g(x_2, u_2), x_2) < \epsilon$. 

It follows that for any given $\epsilon > 0$ there is an $\epsilon$-mapping of P into $U_0$. Therefore³ $\dim P \le \dim U_0$, and Theorem 1 is proved. 


Remark 1. Theorem 1 does not remain true if the polyhedron B is replaced by a general separable metric space. In fact, Pontrjagin⁴ has defined compacta $B_1$ and $B_2$ of dimension 2, whose product space $B_1 \times B_2$ is of dimension 3. Considering $B_1 \times B_2$ as a fibre space over $B_1$, the inequality I no longer holds. 


Remark 2. Without the assumption that the fibres are compact, we can establish the following facts: 1) $\dim (b, B) \le \dim (\pi^{-1}(b), X)$ for any $b \in B$; 2) $\dim B \le \dim X$. To prove 1), let $U_0$ be an open subset of X with $U_0 \supset \pi^{-1}(b)$ and $\dim U_0 = \dim (\pi^{-1}(b), X)$, and let $x_0 \in \pi^{-1}(b)$. A topological mapping g in X of the $\epsilon_0$-neighbourhood $V(b, \epsilon_0)$ of b in B can be defined by $g(u) = \phi(x_0, u)$, $u \in V(b, \epsilon_0)$. Let $U'_0 = U_0 \cap g(V(b, \epsilon_0))$. Then $\dim U'_0 \le \dim U_0$. Clearly $g^{-1}(U'_0)$ is an open subset of B, containing b, so that $\dim (b, B) \le \dim g^{-1}(U'_0)$. This proves 1). 2) is also easily established. 


² For a non-empty compactum Y and a non-empty polyhedron Z, the relation $\dim (Y \times Z) = \dim Y + \dim Z$ is easily deduced from the well known fact [2, p. 34] that the dimension of the product space of a compactum and a 1-dimensional separable metric space is the sum of the dimensions of the two factor spaces. 

³ This follows from an argument given in [3], pp. 364-365. 

⁴ L. Pontrjagin, “Sur une hypothèse fondamentale de la théorie de la dimension,” Comptes Rendus, vol. 190 (1930), pp. 1105-1107. 




2. Proofs of Theorems 2 and 3. 


LEMMA 2. Let a metric space X be a fibre space over a metric space B with compact fibres and n an integer $\ge 0$. The set $B_n$ of points u of B with $\dim \pi^{-1}(u) \ge n$ is open in B. 


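Proof. It suffices to prove that the set $B'_n$ of points u of B with $\dim \pi^{-1}(u) < n$ is closed in B. Let $b \in B$ be a limit point of $B'_n$ and $\zeta$ an arbitrary positive real number. Let $\epsilon < \epsilon_0$ be the number corresponding to b and $\zeta/2$ according to Lemma 1. There is a point $u_0$ of $B'_n$ with $\rho(b, u_0) < \epsilon$. Hence $\sigma(\phi(x, u_0), x) < \zeta/2$ for any $x \in \pi^{-1}(b)$. We define the continuous mapping g of $\pi^{-1}(b)$ into $\pi^{-1}(u_0)$ by $g(x) = \phi(x, u_0)$, $x \in \pi^{-1}(b)$. If $x_1$ and $x_2$ are points of $\pi^{-1}(b)$ with $g(x_1) = g(x_2)$, then $\sigma(x_1, x_2) \le \sigma(x_1, \phi(x_1, u_0)) + \sigma(\phi(x_2, u_0), x_2) < \zeta$. This shows that g is a $\zeta$-mapping in $\pi^{-1}(u_0)$ of the compactum $\pi^{-1}(b)$. Since $u_0 \in B'_n$, $\dim \pi^{-1}(u_0) < n$. It follows that³ $\dim \pi^{-1}(b) < n$ and hence $b \in B'_n$. This proves Lemma 2. 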


Proof of Theorem 2. X being a compactum, the projection $\pi$ is a closed mapping of X on B. Let C be any subset of B and let $D = \pi^{-1}(C)$. The partial mapping $\pi | D$ will also be a closed mapping of D on C. 

We take a cellular decomposition K of the polyhedron B. Let $B_0$ be the set of all points of B, each of which is interior to a ground cell of K, and let $B' = B_0 \cap B^*$. To prove Theorem 2, it is sufficient to prove that B’ is open and dense in B. 
Let V be any non-empty open subset of B. $B_0$ being open and dense in B, $V' = V \cap B_0$ is a non-empty open subset of B. It follows that there is a point b of B such that 

(1) $b \in V'$, and $\dim \pi^{-1}(b) \ge \dim \pi^{-1}(u)$ for any $u \in V'$. 

Let $V_0$ be a neighbourhood of b in B such that 

(2) $V_0 \subset V'$, and $\dim V_0 = \dim (b, B)$. 

Let $U_0 = \pi^{-1}(V_0)$. Then $U_0$ is an open subset of X containing $\pi^{-1}(b)$, and we have 

(3) $\dim U_0 \ge \dim (\pi^{-1}(b), X)$. 

Since the partial mapping $\pi | U_0$ is a closed mapping of $U_0$ on $V_0$, we conclude by making use of a well-known theorem [2, p. 92], that 

(4) $\dim V_0 + \dim \pi^{-1}(b') \ge \dim U_0$ for a certain $b' \in V_0$. 

Combining the inequalities (1)-(4), we get $\dim (b, B) + \dim \pi^{-1}(b) \ge \dim (\pi^{-1}(b), X)$. On the other hand, by Theorem 1, we have $\dim (b, B) + \dim \pi^{-1}(b) \le \dim (\pi^{-1}(b), X)$. It follows that $b \in B^*$, whence $b \in B'$. Thus any non-empty open subset of B contains a point of B’. B’ is therefore dense in B. 

To prove that B’ is open in B, we take an arbitrary point $b \in B'$. Then 

(5) $\dim (b, B) + \dim \pi^{-1}(b) = \dim (\pi^{-1}(b), X)$, 

and b belongs to the interior G of a ground cell of K. Evidently, G is an open subset of B, contained in $B_0$, and 

(6) $\dim (u, B) = \dim (b, B)$ for any $u \in G$. 

By Lemma 2, we have a neighborhood V of b in B such that 

(7) $V \subset G$, and $\dim \pi^{-1}(u) \ge \dim \pi^{-1}(b)$ for any $u \in V$. 

Clearly, there is an open subset U of X such that 

(8) $\pi^{-1}(b) \subset U \subset \pi^{-1}(V)$, and $\dim (\pi^{-1}(b), X) = \dim U$. 

Since $\pi$ is a closed mapping of X on B, we easily establish that the set $V' = B - \pi(X - U)$ is an open subset of B, containing b, and that 

(9) $V' \subset V$, $\pi^{-1}(V') \subset U$. 

Now let b’ be any point of V’. Then $\pi^{-1}(b') \subset \pi^{-1}(V') \subset U$ and $\dim U \ge \dim (\pi^{-1}(b'), X)$. From (5)-(9), it follows that $\dim (b', B) + \dim \pi^{-1}(b') \ge \dim (\pi^{-1}(b'), X)$. But by Theorem 1, we have $\dim (b', B) + \dim \pi^{-1}(b') \le \dim (\pi^{-1}(b'), X)$. Therefore $b' \in B^*$ and hence $V' \subset B^*$. It follows that $V' \subset V \cap B^* \subset G \cap B^* \subset B_0 \cap B^* = B'$. Thus we have shown that any point b of B’ has a neighbourhood V’ in B, contained in B’. B’ is therefore open in B. Theorem 2 is thus proved. 


Proof of Theorem 3. Clearly, $\dim \pi^{-1}(u) \le k$ for any $u \in B$ and there is a point b of B with $\dim \pi^{-1}(b) = k$. Since B is of homogeneous dimension n, we have $\dim (b, B) = n$, and since $\pi$ is a closed mapping of X on B, there is a point b’ of B such that $n + \dim \pi^{-1}(b') \ge m$. Thus, $\dim (b, B) + \dim \pi^{-1}(b) \ge m \ge \dim (\pi^{-1}(b), X)$. But, by Theorem 1, $\dim (b, B) + \dim \pi^{-1}(b) \le \dim (\pi^{-1}(b), X)$. Combining these two inequalities, we see that $n + k = m$. 
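
The way the two inequalities combine can be written as a single chain. This is a sketch of the bookkeeping only; it uses $\dim \pi^{-1}(b') \le k$ and $\dim(\pi^{-1}(b), X) \le \dim X = m$, both of which follow from the definitions given earlier.

```latex
\[
  n + k \;\ge\; n + \dim \pi^{-1}(b') \;\ge\; m \;\ge\; \dim(\pi^{-1}(b), X)
  \;\ge\; \dim(b, B) + \dim \pi^{-1}(b) \;=\; n + k ,
\]
% where the last inequality is Theorem 1.
% Every term in the chain is therefore equal, and in particular n + k = m.
```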


COROLLARY. Under the same hypotheses as Theorem 2, let K be a cellular decomposition of B and B’ a closed subpolyhedron of B built out of some proper sides of ground cells of K. Then $\dim \pi^{-1}(B') < m$, where $m = \dim X$. 

Proof. Let⁵ $c_1, c_2, \cdots, c_l$ be the cells of K which compose the subpolyhedron B’ of B. Then $B' = \bigcup_{1 \le i \le l} c_i$ and $\pi^{-1}(B') = \bigcup_{1 \le i \le l} \pi^{-1}(c_i)$, each $\pi^{-1}(c_i)$ being a closed subset of X. Suppose that $\dim \pi^{-1}(B') \ge m$. Then, there is at least one $c_i$, say $c_1$, such that $\dim \pi^{-1}(c_1) \ge m$. Clearly $\pi^{-1}(c_1)$ is a compactum of dimension m and is a fibre space over $c_1$ with projection $\pi$, and $c_1$ is a polyhedron of homogeneous dimension. By Theorem 3, we have $\dim c_1 + k_1 = m$ where $k_1$ denotes the greatest dimension of the fibres of the fibre space $\pi^{-1}(c_1)$. By hypothesis, there is a ground cell c of K with $c_1$ as a proper side. We have then $\dim c_1 < \dim c$ and $\dim \pi^{-1}(c_1) \le \dim \pi^{-1}(c)$ (= m). Just as we have done for the fibre space $\pi^{-1}(c_1)$, we have for the fibre space $\pi^{-1}(c)$ over c, $\dim c + k = m$ where k denotes the greatest dimension of the fibres of $\pi^{-1}(c)$. Clearly every fibre of the fibre space $\pi^{-1}(c_1)$ coincides with a fibre of the fibre space $\pi^{-1}(c)$, and hence $k_1 \le k$. Thus, from $\dim c_1 < \dim c$, we deduce $\dim c_1 + k_1 < \dim c + k$ which contradicts the relation $\dim c_1 + k_1 = m = \dim c + k$. This proves $\dim \pi^{-1}(B') < m$. 


3. Proofs of Theorems 4 and 5. These theorems are concerned with 
the question whether the fibres of a fibre space have the same dimension. 
This is in general not the case, as was shown by an example given by 
Hurewicz and Steenrod [1, p. 64]. Some sufficient conditions, in the form 
of Theorems 4 and 5, are, however, found to ensure that the fibres have the same dimension. 


We shall precede the proofs of the theorems by a number of Lemmas. 


LEMMA 3. Let $A_1$ and $A_2$ be compacta of dimensions $\le n_1$ and $\le n_2$ respectively with $n_1 \ge 1$. The Čech homology group $H^{n_1+n_2}(A_1 \times A_2, \mathfrak{R})$ vanishes if the Čech homology group $H^{n_1}(A_1, \mathfrak{R})$ vanishes. 


⁵ We use the same symbol c to stand for the cell of K as well as the closed convex set of the polyhedron. 




This Lemma is essentially a generalization of Künneth’s theorem to Čech homology theory. It can be established by a fairly standard procedure in Čech theory. We shall omit its proof here. 


LEMMA 4. Let X be a fibre space over a polyhedron B with cellular decomposition K, and F a subset of B which is contained in $V(b, \epsilon_0) \cap M$ where $V(b, \epsilon_0)$ is the $\epsilon_0$-neighbourhood of b in B and M is the closed star of b with respect to K. Then the subset $\pi^{-1}(F)$ of X and the product space $P = \pi^{-1}(b) \times F$ have the same homotopy type. 


Proof. We have the continuous mapping $\tau$ of $\pi^{-1}(F)$ in P: $\tau(x) = (\phi(x, b), \pi(x))$, $x \in \pi^{-1}(F)$, and the continuous mapping $\theta$ of P in $\pi^{-1}(F)$: $\theta(x, u) = \phi(x, u)$, $x \in \pi^{-1}(b)$, $u \in F$. M being the star of b with respect to K, any segment joining b and a point u of F lies in M. We put $f(u, t) = (1 - t)u + tb$, $u \in F$, $0 \le t \le 1$. Since $F \subset V(b, \epsilon_0)$, the slicing function $\phi$ is defined for all such points $(x, f(u, t))$ of $X \times B$ with $u \in F$ and $0 \le t \le 1$, provided that $\pi(x)$ lies on the segment joining u and b. We then define the homotopy $g(x, t)$ in $\pi^{-1}(F)$ such that $g(x, 0) = x$ and $g(x, 1) = \theta\tau(x)$ for all $x \in \pi^{-1}(F)$ as follows: 

$g(x, t) = \phi(\phi(x, f(\pi(x), t)), \pi(x))$, $x \in \pi^{-1}(F)$, $0 \le t \le 1$. 

Similarly, a homotopy $h(x, u, t)$ is defined in P such that $h(x, u, 0) = (x, u)$, $h(x, u, 1) = \tau\theta(x, u)$ for all $(x, u) \in P$ as follows: 

$h(x, u, t) = (\phi(\phi(x, f(u, 1 - t)), b), u)$, $(x, u) \in P$, $0 \le t \le 1$. 

These homotopies prove Lemma 4. 
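
The endpoint conditions for $h$ can be checked directly from its formula. The sketch below assumes the two standard properties of a slicing function in the sense of Hurewicz-Steenrod, namely $\phi(x, \pi(x)) = x$ and $\pi\phi(x, u) = u$, which are not restated in the text above; the explicit form of $\tau$ used in the last comment is likewise an assumption.

```latex
% Here x \in \pi^{-1}(b), so \pi(x) = b and \phi(x, b) = x; also f(u,1) = b, f(u,0) = u.
\begin{align*}
  h(x, u, 0) &= \bigl(\phi(\phi(x, f(u,1)), b),\, u\bigr)
              = \bigl(\phi(\phi(x, b), b),\, u\bigr) = (x, u), \\
  h(x, u, 1) &= \bigl(\phi(\phi(x, f(u,0)), b),\, u\bigr)
              = \bigl(\phi(\phi(x, u), b),\, u\bigr).
\end{align*}
% With \tau(y) = (\phi(y, b), \pi(y)) (assumed form of \tau) and \theta(x,u) = \phi(x,u),
% the last expression equals \tau\theta(x, u), because \pi\phi(x, u) = u.
```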


LEMMA 5. Let X be a fibre space over a metric space B, and C a compact neighbourhood retract of B. Then $\pi^{-1}(C)$ is a neighbourhood retract of X. 


Proof. C being a neighbourhood retract of B, there is an open subset V of B with $C \subset V$ and there is a continuous mapping f of V in C with $f(u) = u$ for all $u \in C$. Making use of the fact that C is compact, it follows from a well known argument that there is a positive real number $\delta$ such that $\rho(f(u), u) < \epsilon_0$ for all $u \in V \cap V(C, \delta)$ where $V(C, \delta)$ is the $\delta$-neighbourhood of C in B. Put $V' = V \cap V(C, \delta)$. $\pi^{-1}(V')$ is obviously an open subset of X, containing $\pi^{-1}(C)$. We define a neighbourhood retraction g of $\pi^{-1}(C)$ in X by $g(x) = \phi(x, f(\pi(x)))$, $x \in \pi^{-1}(V')$. Therefore $\pi^{-1}(C)$ is a neighbourhood retract of X. 
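
That $g$ is a retraction onto $\pi^{-1}(C)$ can be verified in two lines. The sketch again assumes the slicing-function properties $\phi(x, \pi(x)) = x$ and $\pi\phi(x, u) = u$ (Hurewicz-Steenrod), which are not restated above.

```latex
% g maps \pi^{-1}(V') into \pi^{-1}(C):
\[
  \pi\bigl(g(x)\bigr) = \pi\phi\bigl(x, f(\pi(x))\bigr) = f(\pi(x)) \in C ,
  \qquad x \in \pi^{-1}(V').
\]
% g is the identity on \pi^{-1}(C), since f(u) = u for u \in C:
\[
  g(x) = \phi\bigl(x, f(\pi(x))\bigr) = \phi\bigl(x, \pi(x)\bigr) = x ,
  \qquad x \in \pi^{-1}(C).
\]
```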




Proof of Theorem 4. The empty set E is obviously a compact neighbourhood retract of X of dimension < m with $H^{m-1}(E, \mathfrak{R}) = 0$. Hence J) implies that X is connected. It follows that the polyhedron B is also connected and arcwise connected. All fibres of X have therefore the same homotopy type. 

If n = 0, Theorem 4 is obvious. We therefore assume n > 0. The polyhedron B can be decomposed into a complex K of mesh $< \epsilon_0$ so that for any $b \in B$, the closed star $M_b$ of b with respect to K is a proper subset of B. Let b be an arbitrary point of B. Denote by $N_b$ the boundary of $M_b$. Clearly $N_b$ disconnects B, and is a compact neighbourhood retract of B. Thus, $\pi^{-1}(N_b)$ disconnects X, and is a compact neighbourhood retract of X by Lemma 5. Making use of the Corollary to Theorem 3, we have $\dim \pi^{-1}(N_b) < m$. It follows from J) that $H^{m-1}(\pi^{-1}(N_b), \mathfrak{R}) \neq 0$. Consider the product space $P_b = \pi^{-1}(b) \times N_b$. By Lemma 4, $P_b$ and $\pi^{-1}(N_b)$ have the same homotopy type, whence $H^{m-1}(P_b, \mathfrak{R}) \approx H^{m-1}(\pi^{-1}(N_b), \mathfrak{R})$ and we have 

(10) $H^{m-1}(P_b, \mathfrak{R}) \neq 0$, for any $b \in B$. 


Clearly² $\dim P_b = \dim \pi^{-1}(b) + \dim N_b$ and $\dim N_b = \dim (b, B) - 1$. From Theorem 1, $\dim \pi^{-1}(b) + \dim (b, B) \le \dim (\pi^{-1}(b), X) \le \dim X = m$. On the other hand, (10) gives $\dim P_b \ge m - 1$. It follows that 

(11) $\dim \pi^{-1}(b) + \dim N_b = m - 1$, for any $b \in B$. 

Let k denote the greatest dimension of the fibres of X. We say that 

(12) $H^{k}(\pi^{-1}(b), \mathfrak{R}) \neq 0$, for any $b \in B$. 

(12) is obvious when k = 0, for $\pi^{-1}(b)$ is non-empty. Consider the case k > 0. There is a point $b_0$ of B with $\dim \pi^{-1}(b_0) = k$. If $H^{k}(\pi^{-1}(b_0), \mathfrak{R}) = 0$, then we shall obtain by making use of Lemma 3 and by (11), that $H^{m-1}(P_{b_0}, \mathfrak{R}) = 0$, contradictory to (10). Thus $H^{k}(\pi^{-1}(b_0), \mathfrak{R}) \neq 0$. Since, for any $b \in B$, $\pi^{-1}(b)$ and $\pi^{-1}(b_0)$ have the same homotopy type, we have (12). It follows immediately from (12) that all fibres of X have the same dimension k. By choosing b to be an interior point of a ground cell of K of dimension n, we get $k = m - n$. This proves (i). 

To prove (ii), we first notice that $\dim (b, B) = n$ for any $b \in B$, i.e., B is a polyhedron of homogeneous dimension n. Let B be covered by two closed proper subpolyhedra $L_1$ and $L_2$ built out of cells of K such that no ground cells of K are contained in both $L_1$ and $L_2$, and let $B_0 = L_1 \cap L_2$. Then $\pi^{-1}(B_0)$ disconnects X. By the Corollary to Theorem 3, we have $\dim \pi^{-1}(B_0) \le m - 1$, and by similar considerations as before, we can establish 

(13) $\dim \pi^{-1}(B_0) = m - 1$. 

Putting $X_0 = \pi^{-1}(B_0)$, the partial mapping $\pi | X_0$ is then a closed mapping of $X_0$ on $B_0$. Thus, there is a point $b_0$ of $B_0$ such that $\dim B_0 + \dim \pi^{-1}(b_0) \ge \dim \pi^{-1}(B_0)$, and hence from (i) and (13), we obtain $\dim B_0 = n - 1$. This proves (ii). 


COROLLARY. In Theorem 4, let n > 0. We also have for any $b \in B$ the consequences: (iii) $H^{m-n}(\pi^{-1}(b), \mathfrak{R}) \neq 0$; (iv) the (n−1)-dimensional homology group of B at b [4, p. 121] with coefficients in $\mathfrak{R}$ does not vanish. 

(iii) follows from (12) and $k = m - n$. (iv) is obvious for n = 1. For n > 1, we have $H^{n-1}(N_b, \mathfrak{R}) \neq 0$; otherwise, by (11) and by making use of Lemma 3, we shall get that $H^{m-1}(\pi^{-1}(b) \times N_b, \mathfrak{R}) = 0$, contradicting (10). 


Proof of Theorem 5. Let dim X = m. Let T be an arbitrary compact neighbourhood retract of X of dimension < m which disconnects X and $p_1$, $p_2$, two points of X − T, which belong to different components. There exists a sufficiently fine simplicial decomposition K of X such that the following conditions are satisfied: 1) If $K_1$ denotes the subcomplex of K consisting of all simplexes not containing $p_1$, $p_2$ and $K_0$ the subcomplex consisting of all simplexes of K which meet T and their sides, then $K_0 \subset K_1$; 2)⁶ $K_0$ can be deformed into T by a deformation $f_t(x) \in K_1$, $0 \le t \le 1$, $x \in K_0$, such that $f_0(x) = x$, $f_1(x) \in T$. K being irreducibly closed,⁷ there is an m-cycle $z^m$ of K, with coefficients in the group $G_\mu$ (= additive group of integers modulo a certain $\mu \ge 2$) such that $|z^m| = K$. We denote by $W_1$, $W_2$ the components of $X - K_0$, which contain $p_1$, $p_2$ respectively. Write $z^m = \sum_i \nu_i \sigma_i$, $\nu_i \in G_\mu$. Let $a^m$ be the chain formed by all the terms $\nu_i \sigma_i$ of $z^m$ for which $(\mathrm{Int}\, \sigma_i) \cap W_1 \neq 0$. It is then easy to see that $\partial a^m$ is an (m−1)-cycle in $K_0$. 

We say that $\partial a^m \not\sim 0$ in $K_1$. In fact, suppose $\partial a^m \sim 0$ in $K_1$. By simplicial approximation there is an m-chain $b^m$ in $K_1$ such that $\partial a^m = \partial b^m$. We see that $c^m = a^m - b^m$ is a non-zero m-cycle such that $|c^m|$ is a proper subcomplex of K. But this contradicts the fact that K is irreducibly closed. 

By the deformation $f_t$ it follows that $f_1(\partial a^m) \not\sim 0$ in T. This means that the singular homology group $S^{m-1}(T, G_\mu) \neq 0$. Since T, being a compact neighbourhood retract of a polyhedron, is a locally contractible compactum, its singular and Čech homology groups are isomorphic. Hence the Čech group $H^{m-1}(T, G_\mu) \neq 0$. But the latter is isomorphic to a subgroup of $H^{m-1}(T, \mathfrak{R})$ because $G_\mu$ is isomorphic to a subgroup of $\mathfrak{R}$ and dim T < m. Theorem 5 is thus proved. 

⁶ See [3], p. 343. 

⁷ For the properties of irreducible closed polyhedra, see [3], pp. 274-287; pp. 329-331. 

Remark 3. In a recent paper,⁸ Montgomery and Samelson have made the conjecture that there is no compact fibering of the Euclidean space other than a homeomorphism. Let us remark here that if an m-dimensional Euclidean space is a fibre space over a polyhedron, then the conclusions (i), (ii) in Theorem 4 and (iii), (iv) in the Corollary to Theorem 4 also hold. These can be established by making use of Theorem 4 with a slight modification. The detailed proofs will be omitted here. 


INSTITUTE OF MATHEMATICS, 
ACADEMIA SINICA. 


REFERENCES. 


1. W. Hurewicz and N. Steenrod, “Homotopy relations in fibre spaces,” Proceedings 
of the National Academy of Sciences, vol. 27 (1941), pp. 60-64. 

2. W. Hurewicz and H. Wallman, Dimension Theory, Princeton University Press, 
1941. 

3. P. Alexandroff and H. Hopf, Topologie, Berlin, Springer, 1935. 

4. H. Seifert and W. Threlfall, Lehrbuch der Topologie, Leipzig, Teubner, 1934. 


⁸ D. Montgomery and H. Samelson, “Fiberings with singularities,” Duke Mathematical Journal, vol. 13 (1946), pp. 51-56. 




SEMILINEAR NORMAL BASIS FOR QUASIFIELDS.* 


By TADASI NAKAYAMA. 


The writer has given recently a theorem which may well be called the theorem of semilinear normal basis.¹ Namely:² Let L be a (finite separable) Galois extension of a field K with Galois group $\mathfrak{G}$, and F a subfield of L (not necessarily containing K) such that $F^{\mathfrak{G}} = F$. Then there exists in L an element, whose (L:K) conjugates with respect to K are linearly independent over F or form a module-basis of L over F according as $(L:F) \ge$ or $\le (L:K)$. The theorem can, for instance, effectively be applied to obtain H. J. Riblet’s theorem of differential basis.³ 
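
In the special case F = K the statement reduces to the classical normal basis theorem. A minimal illustration follows; the field $\mathbb{Q}(\sqrt{2})$, the automorphism $s$ and the element $\theta$ are chosen here purely as an example and do not appear in the original.

```latex
% L = \mathbb{Q}(\sqrt{2}), K = F = \mathbb{Q}, Galois group \{1, s\} with s(\sqrt{2}) = -\sqrt{2}.
\[
  \theta = 1 + \sqrt{2}, \qquad \theta^{s} = 1 - \sqrt{2}.
\]
% If c_1\theta + c_2\theta^{s} = 0 with c_1, c_2 \in \mathbb{Q}, then
\[
  (c_1 + c_2) + (c_1 - c_2)\sqrt{2} = 0
  \;\Longrightarrow\; c_1 + c_2 = 0,\; c_1 - c_2 = 0
  \;\Longrightarrow\; c_1 = c_2 = 0 ,
\]
% so the (L:K) = 2 conjugates of \theta form a basis of L over F.
```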

The present note extends the theorem to the case of a noncommutative domain with a Galois group in the sense of N. Jacobson.⁴ To do so we have first to make some preliminary observations on semilinear group rings over quasifields and their moduli (1). Then we obtain the theorem of semilinear normal basis for quasifields in a similar manner as in the commutative case, by combining the preliminary discussion with the theorem on normal basis for quasifields⁵ (2). The theorem of differential basis can also be transferred to quasifields (3). 


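1. Semilinear group rings over quasifields. Let $\Omega$ be a quasifield, and let there be given a finite group $\mathfrak{G}$ of automorphisms of $\Omega$. Let $\Phi$ be a subquasifield of $\Omega$ such that $\Phi^{\mathfrak{G}} = \Phi$, and let $\mathfrak{H}$ be the totality of elements in $\mathfrak{G}$ which leave $\Phi$ elementwise fixed. $\mathfrak{H}$ is an invariant subgroup of $\mathfrak{G}$. Semilinear group rings 

$(\mathfrak{G}, \Omega) = G_1\Omega + G_2\Omega + \cdots + G_g\Omega$ $(g = (\mathfrak{G}:1))$, 
$(\mathfrak{G}, \Phi) = G_1\Phi + G_2\Phi + \cdots + G_g\Phi$ 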


* Received October 6, 1947. 

¹ T. Nakayama, “Halblineare Erweiterung des Satzes der Normalbasis und ihre Anwendung auf die Existenz der derivierten (differentialen) Basis, I,” Proceedings of the Imperial Academy of Tokyo, vol. 21 (1945); II., ibid., vol. 22 (1946). 

² This was stated in a slightly stronger form. 

³ H. J. Riblet, “A differential basis for algebraic fields,” American Journal of Mathematics, vol. 63 (1941). 

⁴ N. Jacobson, “The fundamental theorem of Galois for quasi-fields,” Annals of Mathematics, vol. 41 (1940). 

⁵ T. Nakayama, “Normal basis of a quasi-field,” Proceedings of the Imperial Academy of Tokyo, vol. 16 (1940). 



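are defined as usual; $\xi G = G\xi^{G}$ $(\xi \in \Omega, \Phi)$. The subring

$$(\mathfrak{H}, \Phi) = H_1\Phi + H_2\Phi + \cdots + H_h\Phi \qquad (h = (\mathfrak{H}:1))$$

is an ordinary group ring of $\mathfrak{H}$ over $\Phi$. Let $Z$ be the center of $\Phi$. Then
$(\mathfrak{H}, \Phi) = (\mathfrak{H}, Z) \times \Phi$ (over $Z$).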


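Let $\mathfrak{n}$ be the radical of $(\mathfrak{H}, Z)$, and let

$$(\mathfrak{H}, Z)/\mathfrak{n} = \mathfrak{a}_1 + \mathfrak{a}_2 + \cdots + \mathfrak{a}_r$$

be the decomposition of the semisimple residue ring into simple ideals. We
have

$$(\mathfrak{H}, \Phi)/\mathfrak{n}\Phi = \mathfrak{a}_1\Phi + \mathfrak{a}_2\Phi + \cdots + \mathfrak{a}_r\Phi$$

with simple $\mathfrak{a}_i\Phi$, and $\mathfrak{n}\Phi$ forms the radical of $(\mathfrak{H}, \Phi)$.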
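Now

$$(\mathfrak{G}, \Phi) = G_1(\mathfrak{H}, \Phi) + G_2(\mathfrak{H}, \Phi) + \cdots + G_t(\mathfrak{H}, \Phi)$$

where $\{G_1, G_2, \ldots, G_t\}$ is a representative system of $\mathfrak{G}$ mod $\mathfrak{H}$. Since $\mathfrak{H}$
and $Z$ are invariant under $\mathfrak{G}$, $(\mathfrak{H}, Z)G = G(\mathfrak{H}, Z)$, whence $\mathfrak{n}G = G\mathfrak{n}$, $G$ being
an arbitrary element of $\mathfrak{G}$. Hence

$$\mathfrak{m} = G_1\mathfrak{n}\Phi + G_2\mathfrak{n}\Phi + \cdots + G_t\mathfrak{n}\Phi$$

is a nilpotent (two-sided) ideal in $(\mathfrak{G}, \Phi)$. And

$$(\mathfrak{G}, \Phi)/\mathfrak{m} = G_1(\mathfrak{H}, \Phi)/\mathfrak{n}\Phi + G_2(\mathfrak{H}, \Phi)/\mathfrak{n}\Phi + \cdots + G_t(\mathfrak{H}, \Phi)/\mathfrak{n}\Phi.$$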


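Furthermore $\mathfrak{a}_i \to G^{-1}\mathfrak{a}_iG$ is a permutation of $\mathfrak{a}_1, \mathfrak{a}_2, \ldots, \mathfrak{a}_r$. Let $\mathfrak{J}$ be
the totality of elements $G$ in $\mathfrak{G}$ which map $\mathfrak{a}_1$ on itself, and put
$\mathfrak{G} = \mathfrak{J}G_1 + \mathfrak{J}G_2 + \cdots + \mathfrak{J}G_s$. Then $\mathfrak{a}_i = G_i^{-1}\mathfrak{a}_1G_i = \mathfrak{a}_1^{G_i}$ $(i = 1, 2, \ldots, s)$ form a
transitivity system under $\mathfrak{G}$. We set $\mathfrak{b} = (\mathfrak{a}_1 + \mathfrak{a}_2 + \cdots + \mathfrak{a}_s)\Phi$ and
consider

$$\mathfrak{c} = G_1\mathfrak{b} + G_2\mathfrak{b} + \cdots + G_t\mathfrak{b}$$

under, however,

ASSUMPTION I₀. The group $\mathfrak{J}/\mathfrak{H}$ of automorphisms of $\mathfrak{a}_1\Phi$ is
outer, that is, every class in $\mathfrak{J}/\mathfrak{H}$ except the unit class induces an outer
automorphism in $\mathfrak{a}_1\Phi$.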


The system $\mathfrak{c}$ is then a simple ring, as one sees by the well-known argument
of crossed product theory. Namely, its $\mathfrak{b}$-double-submoduli $G_\mu\mathfrak{a}_i\Phi$ $(i = 1, 2, \ldots, s;\ \mu = 1, 2, \ldots, t)$
are all irreducible and are mutually non-isomorphic.
For, if $G_\mu\mathfrak{a}_i\Phi$ and $G_\nu\mathfrak{a}_j\Phi$ are isomorphic, then obviously $i = j$. Further the
multiplication by $G_\nu\mathfrak{a}_jG_\nu^{-1}$ on the left shows that $G_\nu^{-1}G_\mu\mathfrak{a}_jG_\mu^{-1}G_\nu$ must coincide
with $\mathfrak{a}_i$ $(= \mathfrak{a}_j)$. Hence $G_\nu^{-1}G_\mu$ induces an automorphism in $\mathfrak{a}_j$, whence in
$\mathfrak{a}_j\Phi$. Because of our assumption this automorphism of $\mathfrak{a}_j\Phi$ is outer unless
$\mu = \nu$. Therefore, as one sees easily, the $\mathfrak{a}_j\Phi$-double-moduli $G_\mu\mathfrak{a}_j\Phi$, $G_\nu\mathfrak{a}_j\Phi$
are isomorphic only when $\mu = \nu$. So $\mathfrak{c} = \sum G_\mu\mathfrak{a}_i\Phi$ is the (unique) completely
reducible so-called ideal decomposition of $\mathfrak{c}$ as $\mathfrak{b}$-double-module, and every non-zero
$\mathfrak{b}$-double-submodule contains at least one of the $G_\mu\mathfrak{a}_i\Phi$'s. It is now readily
seen that if this submodule is a two-sided ideal of $\mathfrak{c}$, then it contains all
$G_\mu\mathfrak{a}_i\Phi$, whence it coincides with $\mathfrak{c}$. Thus $\mathfrak{c}$ is simple.

For each transitivity system of $\mathfrak{a}_i$ we obtain an analogous simple ring
(under the respective assumption), and $(\mathfrak{G}, \Phi)/\mathfrak{G}\mathfrak{n} = (\mathfrak{G}, \Phi)/\mathfrak{m}$ is their
direct sum. Hence $\mathfrak{m}$ is indeed the radical of $(\mathfrak{G}, \Phi)$.

Thus a $(\mathfrak{G}, \Phi)$-right-module is completely reducible, if and only if it is
completely reducible as an $(\mathfrak{H}, \Phi)$-module; observe that it is annihilated by
$\mathfrak{m}$ and $\mathfrak{n}$ simultaneously. Our above analysis shows also that two irreducible
right-moduli of $(\mathfrak{G}, \Phi)$ are isomorphic, when (and only when) they are
isomorphic as $(\mathfrak{H}, \Phi)$-moduli; they are isomorphic to irreducible right ideals
of the same simple component $\mathfrak{c}$ belonging to the same transitivity system.
The same holds for completely reducible moduli too.

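As for a general $(\mathfrak{G}, \Phi)$-right-module $\mathfrak{s}$, the submodule $\mathfrak{s}\mathfrak{m} = \mathfrak{s}\mathfrak{n}\Phi = \mathfrak{s}\mathfrak{n}$
is the smallest submodule such that the residue module is $(\mathfrak{G}, \Phi)$-, as well as
$(\mathfrak{H}, \Phi)$-, completely reducible. Suppose now that $\mathfrak{s}$ is isomorphic to $(\mathfrak{G}, \Phi)$
as $(\mathfrak{H}, \Phi)$-module. $\mathfrak{s}/\mathfrak{s}\mathfrak{m} = \mathfrak{s}/\mathfrak{s}\mathfrak{n}$ is obviously $(\mathfrak{H}, \Phi)$-isomorphic to $(\mathfrak{G}, \Phi)/\mathfrak{m}$.
But then, by the above remark on the completely reducible case, it is even
$(\mathfrak{G}, \Phi)$-isomorphic to $(\mathfrak{G}, \Phi)/\mathfrak{m}$. In particular, there exists an element $u \in \mathfrak{s}$,
so that $u(\mathfrak{G}, \Phi)$ (mod $\mathfrak{s}\mathfrak{m}$) $= \mathfrak{s}$ (mod $\mathfrak{s}\mathfrak{m}$). Since $\mathfrak{s}\mathfrak{m}$ is the intersection of
all maximal $(\mathfrak{G}, \Phi)$-submoduli of $\mathfrak{s}$, then $u(\mathfrak{G}, \Phi) = \mathfrak{s}$ too. So $\mathfrak{s}$ is $(\mathfrak{G}, \Phi)$-homomorphic
to $(\mathfrak{G}, \Phi)$; the homomorphism must be an isomorphism, as
one sees by observing composition lengths for instance. The same argument
holds also when the $(\mathfrak{G}, \Phi)$-module $\mathfrak{s}$ is a direct sum of $(\mathfrak{H}, \Phi)$-submoduli
$(\mathfrak{H}, \Phi)$-isomorphic to $(\mathfrak{G}, \Phi)$; there comes a certain number of $u$'s instead
of the single $u$.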

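More generally, it is also valid in the case where $\mathfrak{s}$, a $(\mathfrak{G}, \Phi)$-right-module,
is a direct sum of a (finite or infinite) number of $(\mathfrak{H}, \Phi)$-submoduli, each of
which is $(\mathfrak{H}, \Phi)$-isomorphic to a directly indecomposable right ideal direct component
of $(\mathfrak{G}, \Phi)$. For this last has a form $e(\mathfrak{G}, \Phi)$ with a primitive idempotent
element $e$ in $(\mathfrak{G}, \Phi)$, and so $\mathfrak{s}/\mathfrak{s}\mathfrak{m} = \mathfrak{s}/\mathfrak{s}\mathfrak{n}$ is a direct sum of $(\mathfrak{H}, \Phi)$-submoduli,
each $(\mathfrak{H}, \Phi)$-isomorphic to a module of the form $e(\mathfrak{G}, \Phi)/e(\mathfrak{G}, \Phi)\mathfrak{n} = e(\mathfrak{G}, \Phi)/e\mathfrak{m}$.
Then, again by the above remark on the completely reducible case, it is
$(\mathfrak{G}, \Phi)$-isomorphic to a direct sum of $(\mathfrak{G}, \Phi)$-submoduli, say $\mathfrak{s}_\mu$, which may
well be different from the submoduli just observed, each of which is however
$(\mathfrak{G}, \Phi)$-isomorphic to a module $e(\mathfrak{G}, \Phi)/e\mathfrak{m}$, say $e_\mu(\mathfrak{G}, \Phi)/e_\mu\mathfrak{m}$. Let $u_\mu$ be, for
each $\mu$, an element in $\mathfrak{s}$ such that $u_\mu$ (mod $\mathfrak{s}\mathfrak{m}$) corresponds to $e_\mu$ (mod $e_\mu\mathfrak{m}$) in
the isomorphism between $\mathfrak{s}_\mu$ and $e_\mu(\mathfrak{G}, \Phi)/e_\mu\mathfrak{m}$. Then these elements $u_\mu$
altogether $(\mathfrak{G}, \Phi)$-generate $\mathfrak{s}$, since $\mathfrak{s}\mathfrak{m}$ is contained in all maximal submoduli
of $\mathfrak{s}$. Now construct, not in $(\mathfrak{G}, \Phi)$ but abstractly, the (restricted) direct sum
$\mathfrak{v} = \sum e_\mu(\mathfrak{G}, \Phi)$. Then $e_\mu \to u_\mu$ defines a $(\mathfrak{G}, \Phi)$-homomorphic mapping of $\mathfrak{v}$
onto $\mathfrak{s}$. We want to show that it is really an isomorphism. This is settled
easily by the argument of composition length when $\mathfrak{s}$ is $(\mathfrak{G}, \Phi)$-finite. For
the general case, however, let $\mathfrak{w}$ be the kernel of the homomorphism. We
observe that $\mathfrak{s}$ is as $(\mathfrak{H}, \Phi)$-module a direct sum of submoduli isomorphic to
directly indecomposable right ideal direct components of $(\mathfrak{H}, \Phi)$; $\mathfrak{s} = \sum \mathfrak{t}_\nu$,
$\mathfrak{t}_\nu \cong f_\nu(\mathfrak{H}, \Phi)$ with primitive idempotent elements $f_\nu$ in $(\mathfrak{H}, \Phi)$. Let $t_\nu \leftrightarrow f_\nu$
in the isomorphism, and let $v_\nu$ be a counter image of $t_\nu$ in our $(\mathfrak{G}, \Phi)$-,
whence $(\mathfrak{H}, \Phi)$-homomorphic mapping of $\mathfrak{v}$ onto $\mathfrak{s}$. The sum $\sum v_\nu(\mathfrak{H}, \Phi)$,
necessarily direct, is $(\mathfrak{H}, \Phi)$-isomorphic to $\mathfrak{s} = \sum \mathfrak{t}_\nu$. Its sum with $\mathfrak{w}$ is direct.
So, if $\mathfrak{w} \neq 0$, then $\mathfrak{w} \not\subseteq \mathfrak{v}\mathfrak{m} = \mathfrak{v}\mathfrak{n}$, the intersection of all maximal $(\mathfrak{H}, \Phi)$-submoduli
of $\mathfrak{v}$, and therefore necessarily $\mathfrak{w} = 0$, since our mapping induces
an isomorphism between $\mathfrak{v}/\mathfrak{v}\mathfrak{m}$ and $\mathfrak{s}/\mathfrak{s}\mathfrak{m}$. Thus we have, under our above
assumption,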

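LEMMA 1. Let $\mathfrak{s}$, $\mathfrak{r}$ be two $(\mathfrak{G}, \Phi)$-right-moduli, and let $\mathfrak{s}$ be a direct
sum of $(\mathfrak{H}, \Phi)$-submoduli $(\mathfrak{H}, \Phi)$-isomorphic to directly indecomposable right
ideal direct components of $(\mathfrak{G}, \Phi)$. If then $\mathfrak{r}$ and $\mathfrak{s}$ are $(\mathfrak{H}, \Phi)$-isomorphic,
they are $(\mathfrak{G}, \Phi)$-isomorphic too.

From this follows further

LEMMA 2. Let $\mathfrak{r}$ and $\mathfrak{s}$ be as in Lemma 1. If $\mathfrak{r}$ is a direct summand of $\mathfrak{s}$
as $(\mathfrak{H}, \Phi)$-module, then it is a direct summand in $\mathfrak{s}$ as $(\mathfrak{G}, \Phi)$-module too.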


⁶ More generally our proof to the lemmas remains valid in the case where $\mathfrak{s}$
(as well as $\mathfrak{r}$) is a direct sum of $(\mathfrak{H}, \Phi)$-submoduli all $(\mathfrak{H}, \Phi)$-isomorphic to directly
indecomposable right ideal direct components of a fixed residue ring of $(\mathfrak{G}, \Phi)$. (In
this formulation is included the above completely reducible case too.)

In fact, it follows also from the remark that the essential setting in the above
proof is the following: Let $\mathfrak{B}$ be a ring with the radical $\mathfrak{N}$ and the decomposition
$\mathfrak{B}/\mathfrak{N} = \mathfrak{A}_1 + \mathfrak{A}_2 + \cdots + \mathfrak{A}_r$ of the semisimple residue ring. Let there be given a
finite subgroup $\mathfrak{g}$ of the automorphism class group of $\mathfrak{B}$, and consider a crossed product
$(\mathfrak{g}, \mathfrak{B})$ (with a factor set). Let it be assumed for every $\mathfrak{A}_i$ that every $G$ $(\neq 1) \in \mathfrak{g}$
mapping $\mathfrak{A}_i$ on itself induces an outer automorphism in $\mathfrak{A}_i$; this amounts to the
condition that for every $\mathfrak{g}$-invariant two-sided ideal in the residue ring $\mathfrak{B}/\mathfrak{N}$ the group
$\mathfrak{g}$ is outer. Then $(\mathfrak{g}, \mathfrak{N})$ becomes the radical of $(\mathfrak{g}, \mathfrak{B})$, and Lemmas 1, 2 hold
for $(\mathfrak{g}, \mathfrak{B})$-right-moduli, $(\mathfrak{G}, \Phi)$ and $(\mathfrak{H}, \Phi)$ taken place by $(\mathfrak{g}, \mathfrak{B})$ and $\mathfrak{B}$.

It is perhaps of some interest to note that our $(\mathfrak{G}, \Phi)$ is, together with $(\mathfrak{H}, \Phi)$,
a Frobenius ring.


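Let namely

$$\mathfrak{r}^* = \mathfrak{r}G_1 + \mathfrak{r}G_2 + \cdots + \mathfrak{r}G_t \qquad (t = (\mathfrak{G}:\mathfrak{H}))$$

be the $(\mathfrak{G}, \Phi)$-module induced by $\mathfrak{r}$, in the sense of Frobenius. Similarly,
let $\mathfrak{s}^*$ denote the induced module of $\mathfrak{s}$. Evidently $\mathfrak{r}^*$ is a direct summand
of the $(\mathfrak{G}, \Phi)$-module $\mathfrak{s}^*$. On the other hand, $\mathfrak{r}^*$ is $(\mathfrak{H}, \Phi)$-isomorphic to $\mathfrak{r}^t$,
a direct sum of $t$ isomorphic copies of $\mathfrak{r}$, as one readily verifies by counting
the (finite or infinite) numbers of their mutually isomorphic direct indecomposable
direct components.⁸ Therefore, by Lemma 1, $\mathfrak{r}^*$ and $\mathfrak{r}^t$ are $(\mathfrak{G}, \Phi)$-isomorphic.
Similarly $\mathfrak{s}^*$ and $\mathfrak{s}^t$ are $(\mathfrak{G}, \Phi)$-isomorphic. Thus $\mathfrak{r}^t$ is a direct
summand of $\mathfrak{s}^t$ as $(\mathfrak{G}, \Phi)$-module. Then⁹ $\mathfrak{r}$ itself is a direct summand of
the $(\mathfrak{G}, \Phi)$-module $\mathfrak{s}$.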


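2. Semilinear normal basis. Let $\Omega$, $\Phi$, $\mathfrak{G}$, $\mathfrak{H}$ be as above. Namely, let $\Omega$
be a quasifield, $\mathfrak{G}$ a finite group of its automorphisms, $\Phi$ a subquasifield of $\Omega$
invariant as a whole under $\mathfrak{G}$, and further, $\mathfrak{H}$ the totality of elements in $\mathfrak{G}$
by which $\Phi$ remains elementwise invariant. We make

ASSUMPTION I. The group $\mathfrak{G}/\mathfrak{H}$ of automorphisms of $\Phi$ is outer;
(As a matter of fact, a somewhat weaker assumption I₀ in Section 1 suffices for our
purpose.) and

ASSUMPTION II. The group $\mathfrak{H}$ of automorphisms of $\Omega$ is outer.

Let $\Psi$ be the invariant system of $\mathfrak{H}$ in $\Omega$. Then, because of the Assumption II,
$(\Omega:\Psi) = h$ and $\Omega$ possesses a right, say, normal basis over $\Psi$, that is,
there exists an element $\omega$ in $\Omega$ so that $\omega^{H_1}, \omega^{H_2}, \ldots, \omega^{H_h}$ form a linearly
independent right basis of $\Omega$ over $\Psi$. Hence $\Omega$ and $(\mathfrak{H}, \Psi)$ are isomorphic as
$(\mathfrak{H}, \Psi)$-right-moduli; $\omega^{H_i} \leftrightarrow H_i$. They are so, even more, as $(\mathfrak{H}, \Phi)$-right-moduli.
Here $(\mathfrak{H}, \Psi)$ is as an $(\mathfrak{H}, \Phi)$-right-module a direct sum of $(\Psi:\Phi)_r$
submoduli isomorphic to $(\mathfrak{H}, \Phi)$. Hence $\Omega$ has the same structure as $(\mathfrak{H}, \Phi)$-module.
On the other hand, $(\mathfrak{G}, \Phi)$ is a direct sum of $t = (\mathfrak{G}:\mathfrak{H}) = g/h$
submoduli $(\mathfrak{H}, \Phi)$-isomorphic to $(\mathfrak{H}, \Phi)$.

If here $(\mathfrak{G}:\mathfrak{H}) \leq (\Psi:\Phi)_r$, or, what is the same, $(\mathfrak{G}:1) \leq (\Omega:\Phi)_r$,
then $(\mathfrak{G}, \Phi)$ is a direct summand of $\Omega$ as $(\mathfrak{H}, \Phi)$-module. By Lemma 2,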


⁷ Cf. Nakayama, II, l. c. 1, Footnote 4.

⁸ Even in the $(\mathfrak{H}, \Phi)$-infinite case our structure of $\mathfrak{s}$ implies the uniqueness up to
isomorphism of its decomposition into directly indecomposable components. See for
instance G. Azumaya, "On generalized semi-primary rings and Krull-Remak-Schmidt
theorem," forthcoming in the Japanese Journal of Mathematics.

⁹ See footnote 8.




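$(\mathfrak{G}, \Phi)$ is then a direct summand of the $(\mathfrak{G}, \Phi)$-module $\Omega$. If however
$(\mathfrak{G}:\mathfrak{H}) \geq (\Psi:\Phi)_r$, then the $(\mathfrak{G}, \Phi)$-module $(\mathfrak{G}, \Phi)$ contains conversely $\Omega$ as
a direct summand. More precisely, if $(\Psi:\Phi)_r$ is finite and $= q(\mathfrak{G}:\mathfrak{H}) + w$
$(0 \leq w < (\mathfrak{G}:\mathfrak{H}))$, then the $(\mathfrak{G}, \Phi)$-module $\Omega$ is a direct sum of $q$ submoduli
isomorphic to $(\mathfrak{G}, \Phi)$ and a module which is a direct summand of $(\mathfrak{G}, \Phi)$.
Thus we have

THEOREM. Let $\Omega$, $\Phi$, $\mathfrak{G}$, $\mathfrak{H}$ be as above, and assume I (or the somewhat
weaker I₀ and its analogues for other transitivity systems, which altogether
we denote by I₀'), II. If $(\mathfrak{G}:1) \leq (\Omega:\Phi)_r$, there exists an element $\xi$ in $\Omega$
such that

$$(*) \qquad \xi^{G_1}, \xi^{G_2}, \ldots, \xi^{G_g}$$

are right-, say, linearly independent over $\Phi$. If $(\mathfrak{G}:1) \geq (\Omega:\Phi)_r$, however,
there exists an element $\xi$ in $\Omega$ such that suitable $(\Omega:\Phi)_r$ among $(*)$ form a
linearly independent right-basis of $\Omega$ over $\Phi$. More precisely, if $(\Omega:\Phi)_r$ is
finite and

$$(\Omega:\Phi)_r = q(\mathfrak{G}:1) + v \qquad (0 \leq v < (\mathfrak{G}:1)),$$

then there exist in $\Omega$ $q$ elements $\xi_i$ $(i = 1, 2, \ldots, q)$ and an element $\eta$ so
that

$$\xi_i^{G_j} \qquad (i = 1, 2, \ldots, q;\ j = 1, 2, \ldots, g)$$

and suitable $v$ among

$$\eta^{G_j} \qquad (j = 1, 2, \ldots, g)$$

together form a linearly independent right-basis of $\Omega$ over $\Phi$.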


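3. Differential basis. Let $\Omega$ be a quasifield, and let there be given in $\Omega$
a higher differentiation $\xi \to \xi^{(\nu)}$ $(\nu = 0, 1, 2, \ldots)$ satisfying the iterative law
$(\xi^{(\mu)})^{(\nu)} = \binom{\mu+\nu}{\nu}\xi^{(\mu+\nu)}$,
in the sense of F. K. Schmidt.¹⁰ Further, let there be given a group $\mathfrak{G}$
of automorphisms in $\Omega$ such that $(\xi^{G})^{(\nu)} = (\xi^{(\nu)})^{G}$ for every $\xi \in \Omega$, $G \in \mathfrak{G}$.
Denote the subquasifield of (absolute) constants of the differentiation by $\Phi$.

We have

LEMMA 3. If $\xi_1, \xi_2, \ldots, \xi_n$ $(\in \Omega)$ are right-linearly independent over $\Phi$,
then there exist (exactly) $n$ vectors left-independent over $\Omega$ among
$(\xi_1^{(\nu)}, \xi_2^{(\nu)}, \ldots, \xi_n^{(\nu)})$ $(\nu = 0, 1, 2, \ldots)$.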
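The iterative law assumed above can be checked in a familiar commutative model. The following is only a sketch under stated assumptions: it uses the Hasse derivative on polynomials with integer coefficients (so $\Omega$ is replaced by $\mathbf{Z}[x]$), and the helper name `hasse` is hypothetical, not taken from the paper.

```python
from math import comb

def hasse(poly, nu):
    """nu-th Hasse derivative of a polynomial [c_0, c_1, ...] (lowest degree first).

    Uses D^(nu)(x^n) = C(n, nu) x^(n - nu); an illustrative commutative model only.
    """
    return [comb(n, nu) * c for n, c in enumerate(poly)][nu:]

def iterative_law_holds(poly, mu, nu):
    """Check (f^(nu))^(mu) == C(mu + nu, nu) * f^(mu + nu) coefficient by coefficient."""
    left = hasse(hasse(poly, nu), mu)
    right = [comb(mu + nu, nu) * c for c in hasse(poly, mu + nu)]
    return left == right

f = [5, -3, 0, 7, 2, 1]  # 5 - 3x + 7x^3 + 2x^4 + x^5
print(all(iterative_law_holds(f, mu, nu) for mu in range(4) for nu in range(4)))
```

In this model the constants of the differentiation are exactly the constant polynomials, which play the role of $\Phi$.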


¹⁰ F. K. Schmidt-H. Hasse, "Noch eine Begründung der Theorie der höheren
Differentialquotienten in einem algebraischen Funktionenkörper einer Unbestimmten,"
Journal für Mathematik, vol. 176 (1937).




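This is perhaps more or less well-known, but we shall give its proof for
the sake of completeness. Let $(\xi_1^{(\nu_i)}, \xi_2^{(\nu_i)}, \ldots, \xi_n^{(\nu_i)})$ with $i = 1, 2, \ldots, m$
be a maximal $\Omega$-left-independent system among the above vectors, and assume
$m < n$. We denote the $(m, n)$-matrix $(\xi_j^{(\nu_i)})_{ij}$ by $\Xi$. Then, for each $\nu$,
$(\xi_1^{(\nu)}, \ldots, \xi_n^{(\nu)}) = (\lambda_{\nu 1}, \ldots, \lambda_{\nu m})\Xi$ with $\lambda$'s in $\Omega$. The maximal
number $s$ of $\Omega$-right-independent columns in $\Xi$ can not be $n$;¹¹ $s < n$. We
can assume without loss of generality that the first $s$ columns are independent.
Then, for instance, $\xi_n^{(\nu_i)} = \xi_1^{(\nu_i)}a_1 + \xi_2^{(\nu_i)}a_2 + \cdots + \xi_s^{(\nu_i)}a_s$ $(i = 1, 2, \ldots, m)$
with $a_j \in \Omega$. Taking $\lambda$-combinations we have

$$(1) \qquad \xi_n^{(\nu)} = \xi_1^{(\nu)}a_1 + \xi_2^{(\nu)}a_2 + \cdots + \xi_s^{(\nu)}a_s$$

for every $\nu$. $(\mu)$-differentiation gives

$$(2) \qquad (\xi_n^{(\nu)})^{(\mu)} = \sum_j (\xi_j^{(\nu)})^{(\mu)}a_j + \sum_j (\xi_j^{(\nu)})^{(\mu-1)}a_j^{(1)} + \cdots + \sum_j \xi_j^{(\nu)}a_j^{(\mu)}.$$

Now, suppose that the 1st, 2nd, $\ldots$, $(\mu-1)$-th derivatives of $a_j$ are all 0.
Then only the first and the last sums remain in (2), thus

$$(\xi_n^{(\nu)})^{(\mu)} - \sum_j (\xi_j^{(\nu)})^{(\mu)}a_j = \sum_j \xi_j^{(\nu)}a_j^{(\mu)}.$$

But the left-hand side vanishes, because of the relation (1), $\nu$ replaced by
$\nu + \mu$, and the iterative law. Hence $\xi_1^{(\nu)}a_1^{(\mu)} + \xi_2^{(\nu)}a_2^{(\mu)} + \cdots + \xi_s^{(\nu)}a_s^{(\mu)}
= 0$. Since this is the case for $\nu = 0, 1, 2, \ldots$, in particular for $\nu = \nu_1,
\nu_2, \ldots, \nu_m$, necessarily $a_1^{(\mu)} = a_2^{(\mu)} = \cdots = a_s^{(\mu)} = 0$. So we see by
induction that all the derivatives $a_j^{(\mu)}$ $(\mu = 1, 2, 3, \ldots)$ vanish; $a_j \in \Phi$.
Thus $\xi_n$ turns out to be $\Phi$-right-dependent on $\xi_1, \xi_2, \ldots, \xi_s$.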


Now we assume I (or I₀') and II, and put $k = \min((\mathfrak{G}:1), (\Omega:\Phi)_r)$.
There is, according to our theorem on semilinear normal basis, an element $\xi$
in $\Omega$ such that $\xi^{G_1}, \xi^{G_2}, \ldots, \xi^{G_k}$ are right-independent over $\Phi$, where $G_1, G_2,
\ldots, G_k$ denote suitably chosen $k$ elements in $\mathfrak{G}$. By the above Lemma 3
there exist $\nu_1, \nu_2, \ldots, \nu_k$ such that

$$(\xi^{(\nu_i)G_1}, \xi^{(\nu_i)G_2}, \ldots, \xi^{(\nu_i)G_k}) \qquad (i = 1, 2, \ldots, k)$$

are left-independent over $\Omega$. Then $\xi^{(\nu_i)}$ $(i = 1, 2, \ldots, k)$ are left-independent
over the invariant system $\Lambda$ of $\mathfrak{G}$ in $\Omega$. For, any $\Lambda$-relation among
them would imply the same relation among $\xi^{(\nu_i)G}$ for every $G \in \mathfrak{G}$, in particular
for each $G_j$ $(j = 1, 2, \ldots, k)$, whence the same relation for the above $k$
vectors. We obtain thus


THEOREM. Let $\Omega$, $\mathfrak{G}$, the differentiation $\xi \to \xi^{(\nu)}$, $\Phi$ (and $\mathfrak{H}$) be as above,


¹¹ As a matter of fact $s = m$, as follows from the elementary divisor theorem for
instance, but $s < n$ suffices.




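and let I (or I₀'), II be assumed. Then there exists an element $\xi$ in $\Omega$ whose
suitable $k = \min((\mathfrak{G}:1), (\Omega:\Phi)_r)$ derivatives

$$(\dagger) \qquad \xi^{(\nu_1)}, \xi^{(\nu_2)}, \ldots, \xi^{(\nu_k)}$$

are left-, say, independent over the invariant system $\Lambda$ of $\mathfrak{G}$ in $\Omega$.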


COROLLARY. Assume $(\Omega:\Phi)_r \geq (\mathfrak{G}:1)$, and replace II by a somewhat
stronger

ASSUMPTION II₀. The group $\mathfrak{G}$ of automorphisms of $\Omega$ is outer.

Then $(\Omega:\Lambda) = g$, and $(\dagger)$, with $k = g$, forms a left-basis of $\Omega$ over $\Lambda$.

COROLLARY. Let (as in II₀) $\mathfrak{G}$ be a finite group of outer automorphisms
of a quasifield $\Omega$, and let there be given a differentiation in $\Omega$ satisfying the
iterative law and commutative with every operation in $\mathfrak{G}$. Suppose further
that the subquasifield $\Phi$ of constants is contained in the invariant system $\Lambda$
of $\mathfrak{G}$ in $\Omega$. Then there is an element in $\Omega$ such that its suitable $g = (\mathfrak{G}:1)$
derivatives, including the element itself, form a (linearly independent) left-basis
of $\Omega$ over $\Lambda$.


REMARK. If we choose in Lemma 3 the first system, in the sense of
lexicographic order of $(\nu_1, \nu_2, \ldots, \nu_n)$, then together with a $\nu_i$ there appears
in the system every $\nu < \nu_i$ such that $p \nmid \binom{\nu_i}{\nu}$, $p$ being the characteristic of $\Omega$.
(In case $p = 0$, this means simply that the system is $(0, 1, \ldots, n-1)$.)
This can be seen in the same manner as in the commutative case; see
Nakayama, l. c. II, Hilfssatz 2. So the system $(\dagger)$ in the above theorem,
or its corollaries, may be taken so as to satisfy this condition.

Throughout in the above we have restricted ourselves to quasifields. 
However the Galois theory has been extended recently by G. Azumaya and 
the writer to simple rings, and more generally, to closed irreducible rings.¹²
It is also possible, though perhaps not of much interest, to transfer the 
above to such general cases. 


Addendum: [Added in proof] The assumptions I and II for our theorem 
on semi-linear normal bases may be replaced, as the author has found recently, 
by the assumption that G forms a group of outer automorphisms in Q. It is 
possible to embrace the two cases in one. See forthcoming paper by the 
author entitled “Galois theory for general rings with minimum condition” 
and by G. Azumaya entitled “ Galois theory for uniserial rings.” 


NAGOYA IMPERIAL UNIVERSITY.


¹² T. Nakayama and G. Azumaya, "On irreducible rings," Annals of Mathematics,
vol. 48, pp. 949-966.


LEFT ASSOCIATES OF MONIC MATRICES, WITH AN APPLICATION
TO UNILATERAL MATRIX EQUATIONS.* ¹


By JAMES H. BELL.


Introduction. Let $\mathfrak{F}[\lambda]$ be the ring of polynomials in the indeterminate
$\lambda$, over a field $\mathfrak{F}$. This paper will deal with square matrices of order
$n$ with elements in $\mathfrak{F}[\lambda]$ unless otherwise specified.

A matrix $M$ with elements in $\mathfrak{F}[\lambda]$ may be written:

$$M = M_k\lambda^k + M_{k-1}\lambda^{k-1} + \cdots + M_0 = \sum_{m=0}^{k} M_m\lambda^m$$

where each $M_m$ $(m = 0, 1, 2, \ldots, k)$ is a matrix with elements in $\mathfrak{F}$. If
$M_k \neq 0$, $M$ is said to be of degree $k$ in $\lambda$. When $M_k$ is non-singular, $M$ is
said to be proper of degree $k$ in $\lambda$. In particular, if $M_k$ is the identity matrix
$I$, the matrix $M$ will be called monic² of degree $k$.
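The three definitions just given (degree, proper, monic) can be illustrated by a small sketch. It assumes, purely for illustration, that a polynomial matrix is stored as a list `coeffs` with `coeffs[m]` holding the coefficient matrix $M_m$ over the rationals; the function names are hypothetical.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result          # a row swap changes the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return result

def classify(coeffs):
    """Return (degree k, proper?, monic?) for M = sum coeffs[m] * lambda^m (M nonzero)."""
    k = max(m for m, block in enumerate(coeffs) if any(any(row) for row in block))
    lead = coeffs[k]
    n = len(lead)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return k, det(lead) != 0, lead == identity

# M = [[lambda + 1, 2], [0, lambda]] = M_0 + M_1 * lambda
M = [[[1, 2], [0, 0]], [[1, 0], [0, 1]]]
print(classify(M))
```

A monic matrix is automatically proper, since its leading coefficient is the identity.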

A matrix $T$ with elements in $\mathfrak{F}[\lambda]$ is unimodular if it has an inverse
whose elements are also in $\mathfrak{F}[\lambda]$. Two matrices $A$ and $B$ are called left
associates if there exists a unimodular matrix $T$ such that $TA = B$. The
relationship of left associate is an equivalence relation [1].

The totality of $n \times n$ matrices with elements in $\mathfrak{F}[\lambda]$ may be divided
into classes of left associates. Each class is represented by a unique matrix
in canonical triangular form. This canonical triangular matrix is a matrix
having all of the elements below the main diagonal equal to zero. The
diagonal elements, if they are not zero, are monic polynomials. The elements
of each column are reduced modulo the main diagonal element when that
element is not zero. If the main diagonal element is zero, then every element
of the row in which it occurs is zero.³

The problem considered is that of determining under what conditions
a matrix will be the left associate of a monic matrix of degree $k$. The special
case of this problem, arising when $k = 1$, is of value in the completion of

* Received September 11, 1947; revised August 17, 1948.

¹ This problem was considered by the author in his doctoral thesis carried out under
M. H. Ingraham at the University of Wisconsin. It was first submitted to the American
Journal of Mathematics in June 1941, and in revised form September 11, 1947.

² This terminology is adapted from that used by Birkhoff and MacLane in their
book: A Survey of Modern Algebra (Macmillan 1946), in which a polynomial is termed
monic if the coefficient of the term of highest degree is unity.

³ The Hermite normal form as defined by MacDuffee is the form generally used.
However, the canonical triangular form defined here will be used in keeping with that
used by Ingraham [2]. It is obtained in the same manner as the Hermite normal form
except that the operations are carried out on the columns in reverse order.




the algorithm of M. H. Ingraham [2] for the solution of the unilateral
matrix equation $\sum_{m=0}^{r} R_mX^m = 0$ over $\mathfrak{F}$.

Necessary and sufficient conditions are obtained in answer to this problem.
The canonical triangular form of the given matrix may be readily found [1].
The problem then becomes one of determining whether the canonical triangular
matrix $A = \sum_{m=0}^{s} A_m\lambda^m$ is a left associate of a monic matrix of degree $k$.

It is found that if the diagonal elements of $A$ are $a_{jj}$ $(j = 1, 2, \ldots, n)$,
then the degree of $\prod_{m=1}^{n} a_{mm}$ must equal $kn$ and the degree of $\prod_{m=1}^{j} a_{mm}$ must be $\leq kj$.

If these conditions are satisfied, then a necessary and sufficient condition
that $A$ be a left associate of a monic matrix of degree $k$ is that the $kn \times sn$
matrix

$$W_k = \begin{pmatrix} A_s & A_{s-1} & \cdots & A_{s-k+1} & \cdots & A_1 \\ 0 & A_s & \cdots & A_{s-k+2} & \cdots & A_2 \\ \vdots & & \ddots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_s & \cdots & A_k \end{pmatrix},$$

with elements in $\mathfrak{F}$, be of rank $kn$.
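The rank condition can be tried out numerically. The sketch below assumes the block layout of $W_k$ in which block row $m$ and block column $v$ hold $A_{s-v+m}$ (taken as zero when the index exceeds $s$), with rational entries; `build_w` and `rank` are illustrative names, not the author's.

```python
from fractions import Fraction

def build_w(coeffs, k):
    """coeffs[m] = A_m (n x n, list of rows), m = 0..s; returns the kn x sn matrix W_k."""
    s, n = len(coeffs) - 1, len(coeffs[0])
    zero = [[0] * n for _ in range(n)]
    rows = []
    for m in range(1, k + 1):                       # block row m
        blocks = [coeffs[s - v + m] if s - v + m <= s else zero
                  for v in range(1, s + 1)]         # block column v holds A_{s-v+m}
        for i in range(n):
            rows.append([x for block in blocks for x in block[i]])
    return rows

def rank(matrix):
    """Rank by Gauss-Jordan elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(a[0]) if a else 0):
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                f = a[i][c] / a[r][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

# A = [[1, lambda], [0, lambda^2]] passes the test for k = 1 (rank 2 = kn) ...
A_good = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 1]]]
# ... while A = [[1, 1], [0, lambda^2]] does not (rank 1 < kn).
A_bad = [[[1, 1], [0, 0]], [[0, 0], [0, 0]], [[0, 0], [0, 1]]]
print(rank(build_w(A_good, 1)), rank(build_w(A_bad, 1)))
```

Both example matrices satisfy the two degree conditions, so the rank of $W_1$ is what separates them.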


In the proof of the above conditions, the method for the construction of
the unimodular matrix which transforms the matrix $A$ into the monic matrix
is given. Details for the construction for the case $k = 1$ are given.

The author has extended the results to the case when $\mathfrak{F}$ is a quasi-field.
The conditions obtained are the same if, in place of the rank of $W_k$, the idea
of left-row rank, or right-column rank, is substituted. Also, Ingraham's
algorithm may be extended to the solution of the unilateral matrix equation
over a quasi-field. In this paper, for the sake of brevity, no proofs for the
quasi-field case are presented. The reader is referred to the author's doctoral
thesis [3] for greater detail.


1. Left associates of monic matrices. In the following work $T$ will
be used to represent a unimodular matrix. The canonical triangular form
of the matrix to be tested will be represented by $A = (a_{ij}) = \sum_{m=0}^{s} A_m\lambda^m$. A
monic matrix of degree $k$ will have the form $\lambda^kI + B$, where $B = (b_{ij})$ is of
degree less than $k$. The problem is to determine the conditions under which
a unimodular $T$ exists, such that $TA = \lambda^kI + B$.


LEMMA 1. If two monic matrices are left associates they are identical.

Suppose there is a matrix $U$ such that

$$U\Big(\lambda^kI + \sum_{m=0}^{k-1} B_m\lambda^m\Big) = \lambda^rI + \sum_{m=0}^{r-1} C_m\lambda^m;$$

then $k \leq r$. If $U$ is unimodular,

$$\lambda^kI + \sum_{m=0}^{k-1} B_m\lambda^m = U^{-1}\Big(\lambda^rI + \sum_{m=0}^{r-1} C_m\lambda^m\Big)$$

and $k \geq r$. Therefore, if $U$ is unimodular (that is, if the two monic matrices are
left associates), $k = r$, $U = I$, and $B_m = C_m$.

If a matrix is the left associate of a monic matrix, that monic matrix
is unique. The canonical triangular form of a matrix is also unique. A
monic matrix is non-singular. Therefore, it follows that the unimodular
matrix $T$ which transforms the triangular form into the monic matrix is
unique.

Represent the degree of the element $a_{ij}$ by $d(a_{ij})$. If $a_{jj} \neq 0$, then for
$i \neq j$ either $a_{ij} = 0$, or $d(a_{ij}) < d(a_{jj})$.
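THEOREM 1. If $A$ is in canonical triangular form and is the left associate
of a monic matrix of degree $k$, then $d(\prod_{m=1}^{n} a_{mm}) = kn$ and $d(\prod_{m=1}^{j} a_{mm}) \leq kj$.

If determinants of both sides of the equation $TA = \lambda^kI + B$ are taken,
then $|T| \cdot |A| = |\lambda^kI + B|$, which is a monic polynomial of
degree $kn$. Since $T$ is unimodular and $|A| = \prod_{m=1}^{n} a_{mm}$, where each $a_{mm}$ is a
monic polynomial or zero, it follows that $|T| = 1$, $|A| \neq 0$, and

$$d\Big(\prod_{m=1}^{n} a_{mm}\Big) = kn.$$

The matrix equation $TA = \lambda^kI + B$ may be written as follows:

$$\begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix}\begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix} = \begin{pmatrix} \lambda^kI + B_{11} & B_{12} \\ B_{21} & \lambda^kI + B_{22} \end{pmatrix}$$

where $A_{11}$, $T_{11}$, and $B_{11}$ are $j \times j$ matrices. Since $T_{11}A_{11} = \lambda^kI + B_{11}$,
it follows that $d(\prod_{m=1}^{j} a_{mm}) \leq kj$.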


LEMMA 2. The degree of $T = (t_{ij})$, where $TA = \lambda^kI + B$ and $A$ is in
canonical triangular form, is at most equal to $k$. An element of $T$ is of
degree $k$ if and only if it is a main diagonal element and the corresponding
main diagonal element of $A$ is unity.

Assume $t_{ij}$, the element in the $i$-th row and $j$-th column of $T$, is an
element of highest degree. By hypothesis $\sum_{m=1}^{n} t_{im}a_{mj} = \delta_{ij}\lambda^k + b_{ij}$ ($\delta_{ij}$ the
Kronecker delta). Since $d(b_{ij}) < k$, $d(a_{mj}) < d(a_{jj})$ or $a_{mj} = 0$ (for
$m \neq j$) and $d(t_{ij}) \geq d(t_{im})$; then $t_{ij}a_{jj}$ is the term of highest degree.
Therefore,

$$d(t_{ij}) \leq d(t_{ij}a_{jj}) = d\Big(\sum_{m=1}^{n} t_{im}a_{mj}\Big) \leq k.$$

It follows immediately that $d(t_{ij}) = d(t_{ij}a_{jj}) = k$ if and only if $\delta_{ij} \neq 0$
(i.e. $i = j$) and $a_{jj} = 1$.


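From this point on, it will be assumed that the elements $a_{jj}$ $(j = 1, 2,
\ldots, n)$ satisfy the necessary conditions of Theorem 1. The proof of the
second condition given in the introduction will now be carried out.

Since $d(\prod_{m=1}^{n} a_{mm}) = kn$ and $d(\prod_{m=1}^{n-1} a_{mm}) \leq k(n-1)$, then $d(a_{nn}) \geq k$
and $s \geq k$, where $s$ is the degree of $A$. The elements of $A$ may be written⁴
$a_{ij} = \sum_{m=0}^{s} a_{ij,m}\lambda^m$, where $a_{ij,m} = 0$ if $m > d(a_{jj})$ or $i > j$.
The coefficients $a_{ij,m}$ belong to $\mathfrak{F}$.

Consider the $i$-th row of $T$. From Lemma 2, $t_{ip} = \delta'\delta_{ip}\lambda^k + \sum_{m=0}^{k-1} t_{ip,m}\lambda^m$
$(p = 1, 2, \ldots, n)$, where $\delta' = 1$ if $a_{ii} = 1$ and $\delta' = 0$ if $a_{ii} \neq 1$. Then, for a fixed $i$,

$$\sum_{p=1}^{n}\Big(\delta'\delta_{ip}\lambda^k + \sum_{m=0}^{k-1} t_{ip,m}\lambda^m\Big)\Big(\sum_{m=0}^{s} a_{pj,m}\lambda^m\Big) = \delta_{ij}\lambda^k + b_{ij}.$$

Therefore,

$$\sum_{p=1}^{n}\Big(\sum_{m=0}^{k-1} t_{ip,m}\lambda^m\Big)\Big(\sum_{m=0}^{s} a_{pj,m}\lambda^m\Big) = -\delta'\lambda^k\sum_{m=0}^{s} a_{ij,m}\lambda^m + \delta_{ij}\lambda^k + b_{ij}.$$

The coefficients of $\lambda^{s+k}$ on either side are zero since $\delta'a_{ij,s} = 0$. A
system of $s$ equations in $kn$ unknowns $t_{ip,m}$ $(p = 1, 2, \ldots, n;\ m = 0, 1,
\ldots, k-1)$ is obtained for a fixed $j$, by equating coefficients of $\lambda^u$
$(u = s+k-1, s+k-2, \ldots, k)$. With $a_{pj,m}$ taken as zero for $m > s$, they are as follows:

$$(1) \qquad \sum_{m=1}^{k}\sum_{p=1}^{n} t_{ip,k-m}\,a_{pj,s-v+m} = -\delta'a_{ij,s-v} + \delta_{ij}\delta_{vs} \qquad (v = 1, 2, \ldots, s).$$

For each value of $v$ let $j = 1, 2, \ldots, n$. A system of equations $sn$ in number,

$$(2) \qquad \sum_{m=1}^{k}\sum_{p=1}^{n} t_{ip,k-m}\,a_{pj,s-v+m} = -\delta'a_{ij,s-v} + \delta_{ij}\delta_{vs}$$

$(v = 1, 2, \ldots, s;\ j = 1, 2, \ldots, n$ for each $v)$,

is obtained.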


⁴ When $a_{ij}$ is written as $\sum_{m=0}^{s} a_{ij,m}\lambda^m$ it will be understood that $a_{ij}$ is written in
descending powers of $\lambda$.




If the equations are written out in the order given above, the matrix
of the coefficients of the $kn$ unknowns $t_{ip,k-m}$ is

$$W_k = \begin{pmatrix} A_s & A_{s-1} & \cdots & A_{s-k+1} & \cdots & A_1 \\ 0 & A_s & \cdots & A_{s-k+2} & \cdots & A_2 \\ \vdots & & \ddots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_s & \cdots & A_k \end{pmatrix}.$$

Since $a_{pj,m} = 0$ for $m > d(a_{jj})$, each side of the
equation displayed in (1) above is identically zero for $v = 1, 2, \ldots, s - d(a_{jj})$.
Therefore, the $s$ equations reduce to $d(a_{jj})$ equations, and the system of $sn$
equations in (2) reduces to $\sum_j d(a_{jj})$ $(= kn)$ non-trivial equations (2') not
displayed here. If these equations have a solution, a matrix $T$ may be
constructed so that $TA = \lambda^kI + B$, and since $d(\prod_{m=1}^{n} a_{mm}) = kn$, it follows
that $T$ is unimodular. If a unimodular $T$ exists such that $TA = \lambda^kI + B$,
the system of equations (2') has a unique solution. A necessary and sufficient
condition that the system have a unique solution is that the rank of the
matrix of the system be $kn$. This, in turn, requires that the rank of $W_k$ be $kn$.


THEOREM 2. A necessary and sufficient condition that the canonical triangular
matrix $A = \sum_{m=0}^{s} A_m\lambda^m$ be the left associate of a monic matrix of degree $k$
is that $d(\prod_{m=1}^{n} a_{mm}) = kn$, $d(\prod_{m=1}^{j} a_{mm}) \leq kj$, and the rank of the matrix $W_k$,
defined above, be equal to $kn$.


If the rank of $W_k$ is $kn$, the matrix $T$ may be obtained by setting up
the equations as in (2) and solving for the elements $t_{ip,m}$ $(p = 1, 2, \ldots, n;$
$m = 0, 1, \ldots, k-1)$ for each value of $i = 1, 2, \ldots, n$. Once the coefficients
$t_{ip,m}$ have been found, the element $t_{ip}$ of $T$ may be constructed by
setting $t_{ip} = \delta'\delta_{ip}\lambda^k + \sum_{m=0}^{k-1} t_{ip,m}\lambda^m$, where $\delta' = 1$ if $a_{ii} = 1$ and $\delta' = 0$ otherwise.
The monic matrix is then obtained from the equation $TA = \lambda^kI + B$.
2. Application to the unilateral equation. M. H. Ingraham [2] has
shown that the solution of the unilateral matrix equation $\sum_{m=0}^{r} R_mX^m = 0$, over
$\mathfrak{F}$, may be carried out in the following manner. The matrix $X$, with
elements in $\mathfrak{F}$, is a solution of the unilateral matrix equation if and
only if $\lambda I - X$ is a right divisor of the matrix $R = \sum_{m=0}^{r} R_m\lambda^m$, that is,
$R = M(\lambda I - X)$, where $M$ is a matrix with elements in $\mathfrak{F}[\lambda]$. Both sides
of this equality may be multiplied by a unimodular matrix $U$. Also a
unimodular matrix $T$ and its inverse $T^{-1}$ may be inserted without affecting
the equality. That is, $UR = UMTT^{-1}(\lambda I - X)$, or $Q = PA$ where $Q = UR$,
$P = UMT$ and $A = T^{-1}(\lambda I - X)$. The matrices $U$ and $T$ may be so chosen
that $Q$ and $A$ are in canonical triangular form.
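The equivalence between solving the unilateral equation and right divisibility by $\lambda I - X$ can be seen from a Horner-style right division. The sketch below is an illustration under the assumption that `coeffs[m]` stores $R_m$; it returns a quotient $M$ and a remainder with $R = M(\lambda I - X) + \text{remainder}$, where the remainder equals $\sum R_mX^m$. The function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def right_divide(coeffs, x):
    """Divide R = sum coeffs[m] * lambda^m on the right by (lambda*I - x), degree r >= 1.

    Returns (quotient coefficients M_0..M_{r-1}, remainder).
    """
    r = len(coeffs) - 1
    quotient = [None] * r
    carry = coeffs[r]
    for j in range(r, 0, -1):
        quotient[j - 1] = carry                       # M_{j-1}
        carry = matadd(coeffs[j - 1], matmul(carry, x))
    return quotient, carry                            # carry = R_0 + M_0 X

# Illustrative equation X^2 - C = 0 with C = [[1, 3], [0, 4]]; X = [[1, 1], [0, 2]] solves it.
I2 = [[1, 0], [0, 1]]
R = [[[-1, -3], [0, -4]], [[0, 0], [0, 0]], I2]
X = [[1, 1], [0, 2]]
quotient, remainder = right_divide(R, X)
print(remainder)
```

A zero remainder means $\lambda I - X$ is a right divisor of $R$; trying the identity matrix instead of `X` leaves a nonzero remainder.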

The method of solution is to reduce $R$ to its canonical triangular form
$Q$, and then to find the canonical triangular form $A$ of each possible right
divisor of $Q$. Since $Q$ and $A$ are triangular, the matrix $P$ is necessarily
triangular. Therefore, $q_{jj} = p_{jj}a_{jj}$ $(j = 1, 2, \ldots, n)$. The elements $a_{jj}$ are
selected from the divisors of the main diagonal elements of $Q$ in accordance
with Theorem 1, as it is desired to have $TA = \lambda I - X$. The other elements
of $A$ are obtained by solving congruences of the form

$$(3) \qquad p_{jj}a_{j,j+h} \equiv q_{j,j+h} - \sum_{m=j+1}^{j+h-1} p_{jm}a_{m,j+h} \pmod{a_{j+h,j+h}}.$$

Equation (3) gives a recursive formula by means of which the successive
columns of $A$ may possibly be determined. The congruence may have
no solution, a unique solution, or many solutions. The criterion for solution
of the congruence is that the greatest common divisor of $p_{jj}$ and $a_{j+h,j+h}$ shall
divide $q_{j,j+h} - \sum_{m=j+1}^{j+h-1} p_{jm}a_{m,j+h}$. If $p_{jj}$ and $a_{j+h,j+h}$ are relatively prime, there is
a unique solution. If the division is impossible at any stage, there is no
solution, and there is no right divisor $A$ having the chosen main diagonal
elements.
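The solvability criterion for a congruence of type (3) can be sketched for polynomials over the rationals. Everything below is illustrative: polynomials are stored as coefficient lists with the lowest degree first, `p` stands for $p_{jj}$, `c` for the right-hand side of (3), `m` for the modulus $a_{j+h,j+h}$, and the function names are assumptions rather than part of Ingraham's algorithm as published.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (the zero polynomial becomes [])."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_divmod(a, b):
    """Quotient and remainder of a by b (b nonzero), coefficients lowest degree first."""
    a, b = trim([Fraction(x) for x in a]), trim([Fraction(x) for x in b])
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        shift, factor = len(a) - len(b), a[-1] / b[-1]
        q[shift] = factor
        a = trim([x - factor * (b[i - shift] if 0 <= i - shift < len(b) else 0)
                  for i, x in enumerate(a)])
    return trim(q), a

def poly_gcd(a, b):
    """Euclidean algorithm; the result is a gcd up to a nonzero constant factor."""
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return a

def solvable(p, c, m):
    """True when p * a = c (mod m) has a solution a, i.e. gcd(p, m) divides c."""
    return not poly_divmod(c, poly_gcd(p, m))[1]

lam, lam_sq = [0, 1], [0, 0, 1]
print(solvable(lam, [1, 1], lam_sq))     # gcd is lambda, which does not divide lambda + 1
print(solvable([1, 1], [1, 1], lam_sq))  # lambda + 1 and lambda^2 are relatively prime
```

When the gcd is a nonzero constant the congruence is always solvable, which matches the relatively prime case described above.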
Once a right divisor $A$ is obtained, it is then necessary to determine
whether or not it is the left associate of a monic matrix of degree unity.
If $A = \sum_{m=0}^{s} A_m\lambda^m$, the matrix $W_k$ becomes $W_1$ where

$$W_1 = (A_s \ \ A_{s-1} \ \ \cdots \ \ A_1).$$

The necessary and sufficient condition that $A$ be the left associate of a monic
matrix of degree 1 is that $W_1$ be of rank $n$.

If $W_1$ is of rank $n$, it is possible to find $T$ such that $TA = \lambda I + B$,
where $B$ now has elements in $\mathfrak{F}$. It follows that $X = -B$ is a solution of
the unilateral equation.
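It is possible to find $X$ without finding $T$ explicitly. When $k = 1$, the
equations (2) become

$$(4) \qquad \sum_{p=1}^{n} t_{ip,0}\,a_{pj,s-v+1} = -\delta'a_{ij,s-v} + \delta_{ij}\delta_{vs} \qquad (v = 1, 2, \ldots, s;\ j = 1, 2, \ldots, n),$$

where $\delta' = 1$ if $a_{ii} = 1$ and $\delta' = 0$ otherwise.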

Consider j for those values only such that d(aj;) —=j>0. Designate these 




values of j by 4, where ac Since 
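$a_{pj,m} = 0$ $(m > d(a_{jj}))$, the system of $n$ equations

$$(5) \qquad \sum_{p=1}^{n} t_{ip,0}\,a_{pj,m} = -\delta'a_{ij,m-1} + \delta_{ij}\delta_{m1} \qquad (m = d(a_{jj}), d(a_{jj}) - 1, \ldots, 1),$$

with $\delta' = 1$ if $a_{ii} = 1$ and $\delta' = 0$ otherwise,
is equivalent to the equations in (4). If the matrix $W_1$ of the coefficients
of the unknowns $t_{ip,0}$ be written out, it will be seen that it is the matrix
which is obtained by these steps:⁵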


1. Augment each element of $A$ by proper powers of $\lambda$ with zero
coefficients so that each element in a particular column is a polynomial in
descending powers of $\lambda$, equal in degree to the main diagonal element.


2. Split up each column into columns, each one involving monomials
of the same degree in $\lambda$.


3. Delete all columns involving constant terms only, and then set $\lambda = 1$.


A necessary and sufficient condition that there exist a matrix $T$ such
that $TA = \lambda I - X$, where $d(\prod_{m=1}^{j} a_{mm}) \leq j$ and $d(\prod_{m=1}^{n} a_{mm}) = n$, is that the
rank of $W_1$ equal $n$.

To find $X$, set up the system of equations $DW_1 = K$, where $D = (t_{ip,0})$
and $K$ is constructed as follows: If $a_{ii} = 1$, the $i$-th row of $K$ is the negative
of the $i$-th row of the matrix obtained from step 2 by deleting the columns
involving the leading terms of the $a_{jj}$ and setting $\lambda = 1$. When $a_{ii} \neq 1$, the
$i$-th row of $K$ has unity in the $r$-th place, where $r = d(\prod_{m=1}^{i} a_{mm})$, and zero
elsewhere. By letting $i = 1, 2, \ldots, n$, each row of $K$ is obtained. If $W_1$
has rank $n$, then $D = KW_1^{-1}$. Set $T = \lambda E + D$, where $E$ is a diagonal
matrix and $e_{jj} = 1$ if $a_{jj} = 1$, $e_{jj} = 0$ otherwise $(j = 1, 2, \ldots, n)$. The equation $TA = \lambda I - X$ is true
for $\lambda = 0$. Therefore $X = -KW_1^{-1}A_0$ is a solution of the unilateral equation.
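The recipe $X = -KW_1^{-1}A_0$ can be followed on a small case. The matrix below is an illustrative choice, not an example from the paper: $A = \begin{pmatrix} 1 & \lambda + 2 \\ 0 & \lambda^2 + 3\lambda + 4 \end{pmatrix}$, for which the three steps give the $W_1$ and $K$ written in the code. The helper names are hypothetical and rational arithmetic is assumed.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Gauss-Jordan inverse over the rationals; assumes m is non-singular."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Coefficient matrices of the illustrative A = A_0 + A_1*lambda + A_2*lambda^2.
A0, A1, A2 = [[1, 2], [0, 4]], [[0, 1], [0, 3]], [[0, 0], [0, 1]]
W1 = [[0, 1], [1, 3]]    # lambda^2 and lambda columns of column 2 (steps 1 to 3)
K = [[-1, -2], [0, 1]]   # row 1: a_11 = 1; row 2: unity in place r = 2

D = matmul(K, inverse(W1))
X = [[-x for x in row] for row in matmul(D, A0)]

# lambda*I - X is then a right divisor of A, so A_0 + A_1 X + A_2 X^2 should vanish.
A1X, A2X2 = matmul(A1, X), matmul(A2, matmul(X, X))
residual = [[A0[i][j] + A1X[i][j] + A2X2[i][j] for j in range(2)] for i in range(2)]
print(X, residual)
```

Here $W_1$ has rank 2, so the construction applies, and the residual check confirms that the computed $X$ satisfies the unilateral equation whose coefficients are those of $A$.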


3. The non-commutative case. In this case $\mathfrak{F}$ is taken to be a quasi-field.
The polynomial domain $\mathfrak{F}[\lambda]$, where $\lambda$ is a commutative indeterminate,
is a principal ideal ring. Multiplication among the elements is not
necessarily commutative, but $d(gf) = d(fg) = d(g) + d(f)$ where $g$ and $f$
are non-zero polynomials.

The above work depends upon the existence of a canonical triangular
form for a matrix with elements in $\mathfrak{F}[\lambda]$. V. J. Varineau [4] has shown
the existence of a Hermite normal form for a matrix with elements in a
non-commutative principal ideal ring. The proof follows the method given

⁵ Published without proof as a note to Ingraham's paper. Ingraham [2].




by MacDuffee, but is complicated by the lack of determinants. Also, the 
greatest common right divisor must be used. 

Once the matrix is reduced to its triangular canonical form, the results 
of this paper, with the exception of the proof of Theorem 1, may be carried 
out in the same manner. The concept of rank must be replaced by left row 
rank or right column rank. The proof of Theorem 1 must be revised when 
$\mathfrak{F}$ is a quasi-field, as determinants may not be used. The results of Theorem 1
are true, and the reader is referred to the author’s doctoral thesis [3] for 
details of the proof. 

Ingraham's algorithm depends upon the existence of a factor theorem for
polynomials in $\lambda$, with matrix coefficients whose elements are in $\mathfrak{F}$. The
commutativity of the elements of the coefficient matrices is not necessary in the
proof of the factor theorem. Ore [5] has made a study of congruences of
the type (3) over the polynomial domain of a quasi-field. Therefore, the 
solution of the unilateral matrix equation may be carried out in the same 
manner as for a commutative field. 

In this paper left associates and right unilateral equations have been 
considered. Similar results may be obtained by dealing with right associates 
of monic matrices and left unilateral equations. For example, instead of 
TA=NI+B it would be AT—.AI+B. In this event, the canonical 
triangular form would have the row elements reduced modulo the main 
diagonal elements. Theory may be developed similar to that already carried 
out above. In fact, when $\mathfrak{F}$ is a field, the problems are thrown back on the
work already done by simply taking the transposes of the matrices involved. 


For example, $\sum_{m=0} X^m A_m = 0$ implies $\sum_{m=0} A_m^T (X^T)^m = 0$ (where $A^T$ is the
transpose of $A$).


MICHIGAN STATE COLLEGE, 
East LANSING, MICHIGAN. 


BIBLIOGRAPHY. 


[1] C. C. MacDuffee, The theory of matrices. Chelsea, 1946.

[2] M. H. Ingraham, "Rational methods in matrix equations," Bulletin of the American
Mathematical Society, vol. 47 (1941), pp. 61-70.

[3] J. H. Bell, Topics related to the factorization of matrices. Submitted as his
doctoral dissertation, at the University of Wisconsin. December 1940.

[4] V. J. Varineau, An extension of the theory of matrices with elements in a principal
ideal ring. This paper was submitted as his doctoral dissertation, at the
University of Wisconsin. July 1940.

[5] O. Ore, "Formale Theorie der linearen Differentialgleichungen," Journal für die
Reine und Angewandte Mathematik, vol. 168 (1932), p. 235.




A SUMMABILITY THEOREM FOR DOUBLE ORTHOGONAL 
SERIES WHOSE COEFFICIENTS SATISFY 
CERTAIN CONDITIONS.* 


By JOSEPHINE MITCHELL. 


1. Introduction. Let $\{\phi_{mn}(x,y)\}$ $(m,n = 1,2,\cdots)$ be a complete
orthonormal system of real functions of class $L^2$ defined on the rectangle
$Q$ $(a \le s \le b,\ c \le t \le d)$. By orthonormality we mean that

(1.1) $\iint_Q \phi_{mn}(s,t)\,\phi_{pq}(s,t)\,ds\,dt = \delta_{mp}\,\delta_{nq}$,

where $\delta_{ab} = 0$ if $a \ne b$ and $1$ if $a = b$. (All integrals considered in this
paper are Lebesgue integrals.) An orthonormal system is complete with
respect to functions $f(s,t)$ of class $L^2$ if

(1.2) $\iint_Q f(s,t)\,\phi_{mn}(s,t)\,ds\,dt = 0$

implies $f(s,t) = 0$ almost everywhere on $Q$.
The orthogonal development of any real function $f(x,y)$ of class $L^2$
with respect to the system $\{\phi_{mn}(x,y)\}$ is given by

(1.3) $f(x,y) \sim \sum_{m,n=1}^{\infty} a_{mn}\,\phi_{mn}(x,y)$,

where

(1.4) $a_{mn} = \iint_Q f(s,t)\,\phi_{mn}(s,t)\,ds\,dt$ $(m,n = 1,2,\cdots)$.

The series (1.3) shall be referred to as the double orthogonal series of $f(x,y)$.
We shall frequently use the series

(1.5) $\sum_{m,n=1}^{\infty} a_{mn}^2$.

We are interested in the connection between the limit of the mn-th
partial sum

(1.6) $s_{mn} = \sum_{\mu,\nu=1}^{m,n} a_{\mu\nu}\,\phi_{\mu\nu}(x,y)$

of the orthogonal series (1.3) and the limit of the Cesàro (C,1,1) sum


* Received October 27, 1947; presented to the American Mathematical Society, 
September 1947. 




(1.7) $\sigma_{mn} = (1/mn) \sum_{\mu,\nu=1}^{m,n} s_{\mu\nu}$.

For simple orthogonal series it is known that if the series $\sum_{n=1}^{\infty} a_n^2 < \infty$, then
$\sigma_n \to$ a limit $s$ almost everywhere in $Q$ if and only if the subsequence $s_{2^n} \to s$
almost everywhere in $Q$ [3]. In this paper we are able to prove that if the
two series


(1.8) (i) $\sum_{m,n=1}^{\infty} [\log(m+1)]^{1+\epsilon} a_{mn}^2$, (ii) $\sum_{m,n=1}^{\infty} [\log(n+1)]^{1+\epsilon} a_{mn}^2$,

where $\epsilon$ is an arbitrary positive number, both converge, then the double
orthogonal series (1.3) is (C,1,1) summable to the sum $s$ almost everywhere
in $Q$ if and only if the subsequence $\{s(2^m, 2^n)\}$ $(s(2^m, 2^n) = s_{2^m 2^n})$
approaches the limit $s$ almost everywhere in $Q$ as $m$ and $n \to \infty$ (Theorem
3.1). For the double series we find it necessary to assume the additional
hypotheses concerning series (1.8) in order that certain "cross-product"
series may converge (cf. formula (3.4)).
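
As a purely numerical illustration of definitions (1.6) and (1.7), the following sketch computes the rectangular partial sums and the (C,1,1) means of a finite array of terms. The function names `partial_sums` and `cesaro_11` are hypothetical, and the array `u` stands for the values $a_{\mu\nu}\phi_{\mu\nu}(x,y)$ at one fixed point; nothing here is taken from the paper beyond the two definitions.

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(u):
    """Return s with s[m-1][n-1] = sum of u[i][j] for i < m, j < n, cf. (1.6)."""
    rows = [list(accumulate(row)) for row in u]           # cumulative sums along rows
    cols = [list(accumulate(col)) for col in zip(*rows)]  # then down the columns
    return [list(r) for r in zip(*cols)]


def cesaro_11(u):
    """Return sigma with sigma[m-1][n-1] = (1/mn) * sum of s over the m-by-n corner, cf. (1.7)."""
    total = partial_sums(partial_sums(u))  # summing the partial sums once more
    return [[Fraction(total[m][n], (m + 1) * (n + 1)) for n in range(len(u[0]))]
            for m in range(len(u))]


u = [[1, 2], [3, 4]]
print(partial_sums(u))   # rectangular partial sums s_mn
print(cesaro_11(u))      # (C,1,1) means sigma_mn as exact fractions
```

Exact fractions are used so that the averages can be compared without rounding error.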

The method of proof used in this paper is a generalization to double 
orthogonal series of that used for simple orthogonal series. The results on 
convergence and summability of these series are all stated in [3, Chapter V] 
but many of the ideas used in our proofs were obtained from a study of 
numerous papers on convergence and summability of simple orthogonal series 
to be found in Fundamenta Mathematicae. All the theorems proved in this 
paper hold for higher dimensional spaces but for convenience we state them 
for the two-dimensional case. 

A tool which is used frequently in our proofs is the Schwarz inequality 
for sums and integrals. For sums this inequality reads 


$(\sum_{i=1}^{N} a_i b_i)^2 \le \sum_{i=1}^{N} a_i^2 \sum_{i=1}^{N} b_i^2$,

and similarly for integrals. For the expression "almost everywhere" we use
the abbreviation a.e. The symbol $\sum_i \sum_j$ means that we sum in the order
$\sum_i (\sum_j)$, while $\sum_{i,j}$ means that any order of summation may be used.
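
A minimal sketch of the Schwarz inequality for finite sums, for readers who want to see it on numbers. The helper name `schwarz_gap` is hypothetical; it returns the right side minus the left side, which should never be negative and vanishes when the two sequences are proportional.

```python
def schwarz_gap(a, b):
    """Return sum(a_i^2) * sum(b_i^2) - (sum(a_i * b_i))^2 for equal-length sequences."""
    lhs = sum(x * y for x, y in zip(a, b)) ** 2
    rhs = sum(x * x for x in a) * sum(y * y for y in b)
    return rhs - lhs


print(schwarz_gap([1, 2, 3], [4, 5, 6]))  # strictly positive: sequences not proportional
print(schwarz_gap([1, 2], [2, 4]))        # zero: b = 2a
```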


2. An integration theorem. The following theorem, which is a
generalization to double sequences of a well-known theorem in Lebesgue
integration for simple sequences [3, Chapter I, §2], is necessary for our
methods of proof.


* Logs are taken to the base 2. 



THEOREM 2.1. If the functions $f_{ij}(s,t)$ $(i,j = 1,2,\cdots)$ $\in L$ on
$Q$ $(a \le s \le b,\ c \le t \le d)$, $f_{ij}(s,t) \le f_{kl}(s,t)$ a.e. on $Q$ $(i \le k,\ j \le l)$, and
$\iint_Q f_{ij}(s,t)\,ds\,dt < C$ ($C$ independent of $i$ and $j$), then $\lim f_{ij}(s,t)$ exists
a.e. in $Q$.

It is often convenient to use Theorem 2.1 as a theorem on series, viz.:

COROLLARY 2.1. If $g_{ij}(s,t) \ge 0$ and $\sum_{i,j=1}^{\infty} \iint_Q g_{ij}(s,t)\,ds\,dt < \infty$, then
$\sum_{i,j=1}^{\infty} g_{ij}(s,t)$ is convergent a.e. in $Q$.


3. On the (C,1,1) summability of the double orthogonal series 
(1.3) if the series (1.8) converge. 


3.1. If series (1.8) are assumed to be convergent we can prove the 
following theorem which corresponds to the result obtained for simple 
orthogonal series. 


THEOREM 3.1. If series (1.8) converge, then a necessary and sufficient
condition that the double orthogonal series (1.3) be (C,1,1) summable to
the sum $s$ a.e. in $Q$ is that the double sequence $\{s(2^p, 2^q)\}$ converge to $s$ a.e.
in $Q$ as $p$ and $q \to \infty$.

We wish to prove that

(3.1) $\lim\, [\sigma(m,n) - s(2^p, 2^q)] = 0$ a.e. in $Q$

$(2^p < m \le 2^{p+1},\ 2^q < n \le 2^{q+1})$ if series (1.8) converge. With this aim in
view we break up the difference in (3.1) into

(3.2) $\sigma(m,n) - s(2^p, 2^q) = [\sigma(m,n) - \sigma(2^p, n)] + [\sigma(2^p, n) - \sigma(2^p, 2^q)]$
      $+ [\sigma(2^p, 2^q) - s(2^p, 2^q)]$


and prove that each term of the right side of (3.2) converges to 0 under 
the hypotheses of the Theorem by means of the following set of Lemmas. 


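3.2. Convergence of a double sum to 0. From definitions (1.6) and
(1.7) we have

(3.3) $\sigma_{mn} = \sum_{\mu,\nu=1}^{m,n} \frac{m-\mu+1}{m}\,\frac{n-\nu+1}{n}\, a_{\mu\nu}\phi_{\mu\nu}$.

Hence

(3.4) $\sigma_{mn} - s_{mn} = (1/mn) \sum_{\mu,\nu=1}^{m,n} (\mu-1)(\nu-1)\, a_{\mu\nu}\phi_{\mu\nu}$
      $- (1/m) \sum_{\mu,\nu=1}^{m,n} (\mu-1)\, a_{\mu\nu}\phi_{\mu\nu} - (1/n) \sum_{\mu,\nu=1}^{m,n} (\nu-1)\, a_{\mu\nu}\phi_{\mu\nu}$
      $= R(m,n) - P(m,n) - Q(m,n)$

respectively. For the subsequences $\{2^p\}$ of $\{m\}$ and $\{2^q\}$ of $\{n\}$ it is readily
proved by a direct generalization of the one variable methods that
$R(2^p, 2^q) \to 0$ as $p$ and $q \to \infty$ (see Lemma 3.1). The difficulty arises in
proving that the sequences $\{P(2^p, 2^q)\}$ and $\{Q(2^p, 2^q)\}$ converge. However
we shall find that if (1.8) are convergent, then these two sequences $\to 0$ as
$p$ and $q \to \infty$.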
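
The splitting of $\sigma_{mn} - s_{mn}$ into three weighted sums follows from the definitions (1.6) and (1.7) alone: the weight of the term with indices $(\mu,\nu)$ in $\sigma_{mn}$ is $(1 - (\mu-1)/m)(1 - (\nu-1)/n)$, and expanding the product gives one sum with factor $(\mu-1)(\nu-1)/mn$, one with $(\mu-1)/m$ and one with $(\nu-1)/n$. The sketch below checks this identity in exact arithmetic on an arbitrary finite array. The function name `split_terms` and the labels `R`, `P`, `Q` for the three sums are assumptions made for the illustration.

```python
from fractions import Fraction


def split_terms(u, m, n):
    """Return (sigma, s, R, P, Q) at (m, n) for terms u[mu-1][nu-1]."""
    def s_at(k, l):
        # rectangular partial sum over the k-by-l corner
        return sum(u[i][j] for i in range(k) for j in range(l))

    idx = [(mu, nu) for mu in range(1, m + 1) for nu in range(1, n + 1)]
    s = s_at(m, n)
    # sigma computed straight from the definition: average of partial sums
    sigma = Fraction(sum(s_at(k, l) for k, l in idx), m * n)
    R = Fraction(sum((mu - 1) * (nu - 1) * u[mu - 1][nu - 1] for mu, nu in idx), m * n)
    P = Fraction(sum((mu - 1) * u[mu - 1][nu - 1] for mu, nu in idx), m)
    Q = Fraction(sum((nu - 1) * u[mu - 1][nu - 1] for mu, nu in idx), n)
    return sigma, s, R, P, Q


u = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sigma, s, R, P, Q = split_terms(u, 3, 2)
print(sigma - s == R - P - Q)
```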


LEMMA 3.1. If series (1.5) is convergent, then

(3.5) $\lim R(2^p, 2^q) = \lim\, (1/2^{p+q}) \sum_{\mu,\nu=1}^{2^p,2^q} (\mu-1)(\nu-1)\, a_{\mu\nu}\phi_{\mu\nu} = 0$


a.e. in Q [cf. 3, Theorem nie 
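Proof. By the orthonormal properties (1.1) we have formally

(3.6) $\sum_{p,q=1}^{\infty} \iint_Q R^2(2^p, 2^q)\,ds\,dt = \sum_{p,q=1}^{\infty} (1/4^{p+q}) \sum_{\mu,\nu=1}^{2^p,2^q} (\mu-1)^2(\nu-1)^2 a_{\mu\nu}^2$.

Since the terms of the latter series are positive we can interchange the order
of summation to get$^2$

(3.7) $\sum_{p,q=1}^{\infty} (1/4^{p+q}) \sum_{\mu,\nu=1}^{2^p,2^q} (\mu-1)^2(\nu-1)^2 a_{\mu\nu}^2$
      $= \sum_{\mu,\nu=2}^{\infty} (\mu-1)^2(\nu-1)^2 a_{\mu\nu}^2 \sum_{p=E[\log\mu]}^{\infty} \sum_{q=E[\log\nu]}^{\infty} 1/4^{p+q}$
      $\le (16/9) \sum_{\mu,\nu=2}^{\infty} a_{\mu\nu}^2$.

Hence by Corollary 2.1 the series $\sum_{p,q=1}^{\infty} R^2(2^p, 2^q)$
is convergent a.e. in $Q$, from which (3.5) follows.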
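
The constant $16/9$ comes from a geometric tail: for each index, $\sum_{p \ge E[\log \mu]} 4^{-p} = (4/3)\,4^{-E[\log\mu]} \le (4/3)\mu^{-2}$, so that $(\mu-1)^2$ times the tail is below $4/3$, and the two indices together give $(4/3)^2$. A small sketch under the assumption that logarithms are to base 2 and $E[\cdot]$ rounds up, as the footnotes state; the names `ceil_log2` and `weight` are hypothetical.

```python
from fractions import Fraction


def ceil_log2(mu):
    """Least integer >= log2(mu) for an integer mu >= 1."""
    return (mu - 1).bit_length()


def weight(mu):
    """(mu - 1)^2 times the tail sum of 4^(-p) over p >= ceil(log2(mu))."""
    tail = Fraction(4, 3) / 4 ** ceil_log2(mu)  # closed form of the geometric tail
    return (mu - 1) ** 2 * tail


# every single-index weight stays below 4/3, hence the product stays below 16/9
print(max(weight(mu) for mu in range(2, 1000)) < Fraction(4, 3))
```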
3.3. LEMMA 3.2. If series (1.8ii) is convergent, then

(3.8) $\lim P(2^p, 2^q) = 0$ a.e. in $Q$.

Similarly if series (1.8i) is convergent, then

(3.9) $\lim Q(2^p, 2^q) = 0$ a.e. in $Q$.

$^2$ The notation $E[\log \mu]$ means the nearest integer greater than or equal to $\log \mu$.


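Proof. In order to obtain a majorant for the expression $P^2(2^p, 2^q)$ of
(3.4) which shall be monotonic non-decreasing with respect to the subscript
$2^q$ we break up the sum $\sum_{\nu=1}^{2^q}$ in the following manner and apply the Schwarz
inequality

(3.10) $(\sum_{\nu=1}^{2^q})^2 = (\sum_{\nu=1}^{2} + \sum_{\alpha=2}^{q} \sum_{\nu=2^{\alpha-1}+1}^{2^{\alpha}})^2 \le 2(\sum_{\nu=1}^{2})^2 + 2(\sum_{\alpha=2}^{q} \sum_{\nu=2^{\alpha-1}+1}^{2^{\alpha}})^2$.

Applying this inequality again to the second term on the right we get

(3.11) $(\sum_{\nu=1}^{2^q})^2 \le 2(\sum_{\nu=1}^{2})^2 + 2A_\epsilon \sum_{\alpha=2}^{q} (\alpha-1)^{1+\epsilon} (\sum_{\nu=2^{\alpha-1}+1}^{2^{\alpha}})^2$,

where $A_\epsilon = \sum_{n=1}^{\infty} n^{-(1+\epsilon)} < \infty$ for any arbitrary positive number $\epsilon$. Now let the
summand be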


and sum with respect to p for p=1,2,---,m. Then 
2a 


(3.13) P2(20,20) <2 (Sop)? Sy)? 
p=1 p=1 p=1 a=2 


T' ma, 


and the sequence $\{T_{mq}\}$ is monotonic non-decreasing in both $m$ and $q$
(m,q=1,2,---). Integrate both sides of (3.13) over the rectangle Q 


and use the orthonormal properties (1.1). We get 


(3. 14) 20) ds dt < Tngds at 


m 


p=1 


p=1 a=2 y=22-1+1 


But $\alpha - 1 = \log 2^{\alpha-1} < \log \nu$ for $\nu = 2^{\alpha-1}+1, \cdots, 2^{\alpha}$. Furthermore if we
replace $\log \nu$ by $\log(\nu+1)$, then, since $\log(\nu+1) \ge 1$ $(\nu = 1, 2, \cdots)$ and


a 

Ae>n*=7°/4>1, we can combine the two inequalities on the right 
n=1 

side of (3.14) to obtain 


(3. 15) f P2(22, 2%) ds dt < dt 
Q p=1 


<24,.3 > [log (v + 1) ]**«(1/4?) 


By the same method as we used in inequality (3.7) we can prove that the
right side of (3.15) is less than or equal to $(8/3) A_\epsilon \sum [\log(\nu+1)]^{1+\epsilon} a_{\mu\nu}^2$.

Hence by Theorem 2.1 the monotonic non-decreasing sequence $\{T_{mq}\}$
approaches a limit a.e. in $Q$, that is, the series


(3. 16) +E Se»)? 


is convergent a.e. in $Q$. Consequently from this result and the absolute
convergence of series (3.16), we have


(3.17) Tim [( + (a— 


= 0 a.e. in 


But by the inequality expressed in (3.11) with the summand of (3.12)
the positive term $P^2(2^p, 2^q)$ is less than or equal to $2A_\epsilon$ multiplied
by this sum. Hence $\lim P^2(2^p, 2^q) = 0$ a.e. in $Q$ and consequently
$\lim P(2^p, 2^q) = 0$ a.e. in $Q$. Similarly (3.9) is proved, which completes
the proof of Lemma 3.2. From the conclusions of Lemmas 3.1 and 3.2
and equation (3.4) we see that

(3.18) $\lim\, [\sigma(2^p, 2^q) - s(2^p, 2^q)] = 0$ a.e. in $Q$.

Consequently if series (1.8) converge the sequences $\{\sigma(2^p, 2^q)\}$ and
$\{s(2^p, 2^q)\}$ are equiconvergent a.e. in $Q$ and to the same limit.
3.4. To complete the proof of Theorem 3.1 we shall prove

LEMMA 3.3. If series (1.8i) converge, then

(3.19) $\lim\, [\sigma(2^p, n) - \sigma(2^p, 2^q)] = 0$ a.e. in $Q$,

and if series (1.8ii) converge, then

(3.20) $\lim\, [\sigma(m, n) - \sigma(2^p, n)] = 0$ a.e. in $Q$

$(2^p < m \le 2^{p+1},\ 2^q < n \le 2^{q+1})$.

If the integer $q$ be such that $2^q < n \le 2^{q+1}$, then

(3.21) $\sigma(2^p, n) - \sigma(2^p, 2^q) = \sum_{\nu=2^q}^{n-1} [\sigma(2^p, \nu+1) - \sigma(2^p, \nu)]$,

and by the Schwarz inequality

(3.22) $[\sigma(2^p, n) - \sigma(2^p, 2^q)]^2 \le \sum_{\nu=2^q}^{n-1} (1/\nu) \sum_{\nu=2^q}^{n-1} \nu\,[\sigma(2^p, \nu+1) - \sigma(2^p, \nu)]^2$.


Now 


(3.23) (1/v) = (1/24) (1/20 —1) < 24/201, 


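so that consequently

(3.24) $\lim\, [\sigma(2^p, n) - \sigma(2^p, 2^q)]^2 \le \lim \sum_{\nu=2^q}^{2^{q+1}-1} \nu\,[\sigma(2^p, \nu+1) - \sigma(2^p, \nu)]^2$

and similarly

(3.25) $\lim\, [\sigma(m, n) - \sigma(2^p, n)]^2 \le \lim \sum_{\mu=2^p}^{2^{p+1}-1} \mu\,[\sigma(\mu+1, n) - \sigma(\mu, n)]^2$.

From inequalities (3.24) and (3.25) Lemma 3.3 will be a consequence
of the following Lemma.

LEMMA 3.4. If series (1.8i) is convergent, then

(3.26) $\lim \sum_{\nu=2^q}^{2^{q+1}-1} \nu\,[\sigma(m, \nu+1) - \sigma(m, \nu)]^2 = 0$ a.e. in $Q$.

Similarly if series (1.8ii) converges, then

(3.27) $\lim \sum_{\mu=2^p}^{2^{p+1}-1} \mu\,[\sigma(\mu+1, n) - \sigma(\mu, n)]^2 = 0$ a.e. in $Q$.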


Proof. From (3.3) we have that 
mv+tm——a+ti1 v—B+2 


(3.28) o(m,v+1)—o(m,v) = Aaphap 


a; B=1 
But 




y—-B4+2 
v+1 v v(v+1) 


(3. 29) 


so that 


mov (B—1)(m 1) 
(3.30) o(m,v+1) —o(m, v) 


Qa,vs1a,v+1 


m(v-+1) 


and we break up the right side of this expression into the four sums 


m (B —1)(«#—1) m 


3. 31i = 
(3. 31ii) > => ye) 

m ¢— m (2) 
(3. 31iii) = Qa,vsia,v+1 =24 a,vsi(m) 
(3. 31iv) =D 


+ 1 a=1 


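We may obtain majorants for the squares of these four sums which
shall be monotonic non-decreasing with respect to the subscript $m$ by the
following procedure. Let the dyadic representation of $m$ be given by

(3.32) $m = 2^p + 1 + \sum_{k=1}^{p} z_k 2^{p-k}$,

where $z_k = 0$ or $1$. Using (3.32) we break up the sum $\sum_{\alpha=1}^{m}$ as follows:

(3.33) $\sum_{\alpha=1}^{m} = \sum_{\alpha=1}^{2} + \sum_{\mu=2}^{p} \sum_{\alpha=2^{\mu-1}+1}^{2^{\mu}} + \sum_{\mu=0}^{p-1} \sum_{\alpha=s(\mu)}^{s(\mu+1)-1} + \sum_{\alpha=m}$,

where

(3.34) $s(\mu) = 2^p + 1 + \sum_{k=1}^{\mu} z_k 2^{p-k}$ $(\mu = 1, \cdots, p-1)$,
       $s(0) = 2^p + 1$.

(If $s(\mu) = s(\mu+1)$ for any $\mu$ $(\mu = 0, \cdots, p-1)$, then the value of the
sum $\sum_{\alpha=s(\mu)}^{s(\mu+1)-1}$ will be taken to be zero.)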
An illustration of this process for the number $m = 14$ is as follows.
The dyadic representation of $14 = 2^3 + 1 + (1\cdot 2^2 + 0\cdot 2^1 + 1\cdot 2^0)$ so that in
(3.32), $z_2 = 0$ and $z_1 = z_3 = 1$. From (3.34), $s(0) = 9$, $s(1) = 9 + z_1 2^2 = 13$,
$s(2) = 9 + z_1 2^2 + z_2 2 = 13$ and $s(3) = 9 + z_1 2^2 + z_2 2 + z_3 2^0 = 14$.
Since $s(1) = s(2)$ the sum $\sum_{\alpha=s(1)}^{s(2)-1} = 0$. Hence by (3.33) the sum
$\sum_{\alpha=1}^{14}$ would be broken up into

$(\sum_{\alpha=1}^{2} + \sum_{\alpha=3}^{4} + \sum_{\alpha=5}^{8}) + (\sum_{\alpha=9}^{12} + \sum_{\alpha=13}^{13}) + \sum_{\alpha=14}$.
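
The dyadic splitting can be reproduced mechanically. The sketch below is one reading of (3.32) to (3.34), assuming $2^p < m \le 2^{p+1}$ with $m \ge 3$, digits $z_1, \cdots, z_p$ read from the most significant end, and $s(p) = m$ as in the worked example; the function name `dyadic_blocks` is hypothetical. It returns the digits, the break points $s(\mu)$, and the index blocks as inclusive pairs (an empty block has its upper end below its lower end).

```python
def dyadic_blocks(m):
    """Return (p, z, s, blocks) for an integer m >= 3 with 2**p < m <= 2**(p+1)."""
    p = (m - 1).bit_length() - 1
    r = m - 2 ** p - 1                      # 0 <= r < 2**p, written with p binary digits
    z = [(r >> (p - k)) & 1 for k in range(1, p + 1)]
    s = [2 ** p + 1]                        # s(0)
    for k in range(1, p + 1):
        s.append(s[-1] + z[k - 1] * 2 ** (p - k))
    blocks = [(1, 2)]
    blocks += [(2 ** (mu - 1) + 1, 2 ** mu) for mu in range(2, p + 1)]
    blocks += [(s[mu], s[mu + 1] - 1) for mu in range(p)]
    blocks.append((m, m))                   # the single term alpha = m
    return p, z, s, blocks


p, z, s, blocks = dyadic_blocks(14)
print(p, z, s)
print(blocks)
```

For $m = 14$ this gives the break points 9, 13, 13, 14, with the block from $s(1)$ to $s(2) - 1$ empty, in agreement with the illustration above.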


By the Schwarz inequality applied to (3. 33) 


and applying this inequality again to the two inner sums we get 


(3. 36) a[( 3)? 


u=0 a=8s 


Let the summand in (3.36) be the expression ¥ v(m) of (3.31i) 
and denote the resulting sum on the right side by #5». Integrating over Q 
we have 


(3. 37) f dt < 4{Az)(m) 
Q 


8(ut1)-1 
T [Ac (4 — TP > JAav(m) + Amv(m) }; 
where 
(3. 38) Aaqv(m) on (8 —1)*(a—1)? 


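But $\mu - 1 = \log 2^{\mu-1} < \log \alpha < \log(\alpha+1)$ for $\alpha = 2^{\mu-1}+1, \cdots, 2^{\mu}$, and
$p = \log 2^p < \log(\alpha+1)$ for $\alpha = s(\mu)$ to $s(\mu+1) - 1$. Hence as in
inequality (3.15) we get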


(3. 39) dt < 4{Ay(m) 


+ [Ae 435° [log(a+1)]* 


=2 u=0 a=s(p) 


< 4A, [log (a 1) 


a=2 


Now 2? < mS 2”*1, Therefore (1/m) < (1/2) = 2/2? and using (3. 38) 
we have 


2 




1)2(a—1)2 
(3. 40) dt = 16A, + [log (a + 1) 


Multiply both sides of this inequality by $\nu$ and sum with respect to $p$ and $\nu$.
Using the methods of inequality (3.7) and the fact that


(3. 41) (1/r*) < 1/8" 
we get i 
(3.42) Sy dt 


< 16A, > > [log (a + 1) 


pw=1a=2 B=2 (v + 


< 164.3 (8 —1)*(a—1)?[log(a + 1) 340-1 
a,B=2 p=p’ v=B 


< (4/3) (1644) (6 —1)?(a—1)?[log(a + 1) 
< (64/3) A, [log (a < 


where p’ = E[log a] —1 if «42, p’=1 if a2. Hence by Corollary 2.1 


co 
the series of positive terms, } vt"), is convergent a.e. in Q. Therefore 


p,v=1 
29-1 
(3. 43) lim > = lim > — 2 IL ® 
p=24 


a.e. in Q. Whence by the definition of ¢, and inequality (3.36) it 


follows that 
2q+1-1 m = (2— 1) 


(3. 44) lim > 


a 2—( a.e. in Q. 
=a=2 B=2 + 1)m Q 


We may discuss the sum given in (3. 31iii) in the same manner as we 
have done for (3. 31i) and it follows that 


(3. 45) 


In inequality (3.36) used with the summand ¥),, of sum (3. 31ii) 
the right side is a monotonic non-decreasing sequence with respect to p. 
Denote it by ¢@ ,. Multiply ¢ by v and sum with respect to v for 
v=1,---,n. The resulting sequence {7} is monotonic with respect to 
both p and n. Integrating over the rectangle Q we have by the same pro- 
cedure as used in inequalities (3.37), (3.39) and (3.42) that 


Aa, 0 a. e. in Q. 



3. 46 f Tpnds dt S 4A, ] 1) 

( ) Q 22 20041) [og (2 + )] Qa 


4A, (8 —1)*[log(a + 1) ]***a7a¢ 


a 2 


<= [log(« + 1) 


Hence by Theorem 2.1 the sequence $\{T_{pn}\}$ approaches a limit a.e. in $Q$ as
$p$ and $n \to \infty$. Consequently


(3. 47) lim > vt!) lim — 0 a. e. in Q. 
v=24 psq->00 


Therefore from inequality (3.36) used with the summand ¥®),, it follows 
that 


2g+1- m 


(3. 48) hm > »(> daphap)” = 0 a.e. in Q. 


a=1 B=2 V 
The sum (3. 3liv) may be discussed in a similar manner and we get that 
2q+1-7 m 


(3. 49) lim > 


= 0 a.e. in Q. 


Now from formulas (3.30) and (3.31) and the Schwarz inequality we 
have that 


(3. 50) [o(m, vy +1) —o(m, v)]? 
+ + (Sy Mara(m))? + 


a=2 


Hence it follows from the conclusions expressed in (3.44), (3.45), (3.48) 
and (3.49) that if the series (1.8i) converge, then 


(3. 51) lim 3S v[o(m,v+1) —o(m,v)]? =0 ae. in Q. 


Similarly if series (1.8ii) converge, then equation (3.27) may be proved 
to hold, which completes the proof of Lemma 3. 4. 


3.5. The following theorem gives a sufficient condition for (C, 1,1) 
summability. 


THEOREM 3.2. If the series (1.8) and

$\sum_{m,n=1}^{\infty} [\log\log(m+3)]^2 [\log\log(n+3)]^2 a_{mn}^2$

all converge, then the double orthogonal series (1.3) is (C,1,1) summable
a.e. in the rectangle $Q$.

Proof. This theorem is a consequence of Theorem 3.1 and the following
theorem which is due to R. P. Agnew [1, Theorem 16.1].

THEOREM. If the coefficients in a double orthogonal series are such that

$\sum_{m,n=1}^{\infty} [\log\log(m+3)]^2 [\log\log(n+3)]^2 a_{mn}^2$

is convergent, then there exists a function $f(x,y)$ such that $\lim s(2^p, 2^q)
= f(x,y)$ essentially uniformly over $Q$.

Theorem 4.3 [1] would also give a sufficient condition for (C,1,1)
summability of the double orthogonal series.


UNIVERSITY OF ILLINOIS. 


REFERENCES. 


[1] R. P. Agnew, "On double orthogonal series," Proceedings of the London
Mathematical Society, vol. 33 (1932), pp. 420-434.

[2] T. J. I'A. Bromwich, An Introduction to the Theory of Infinite Series, Macmillan
and Co., 2nd edition (1931).

[3] S. Kaczmarz and H. Steinhaus, Theorie der Orthogonalreihen, Monografje
Matematyczne, Tom VI, Warsaw (1935).


* Essentially uniform convergence over Q implies convergence a.e. over Q. 




NOTE ON THE PROPERTIES OF FOURIER COEFFICIENTS.* 


By CHING-TSUN LOO.


1. Introduction. In this paper we shall consider the Fourier series of
real-valued and L-integrable functions of period $2\pi$. We shall often consider
only even or odd functions. Even functions will be denoted by $f(x)$, and
their cosine Fourier coefficients by $a_n$; odd functions will be denoted by $g(x)$
and their sine coefficients by $b_n$. Thus

(1.1) $f(x) \sim \sum_{n=1}^{\infty} a_n \cos nx$, $g(x) \sim \sum_{n=1}^{\infty} b_n \sin nx$,

assuming for simplicity that the integral of $f(x)$ over $(0, 2\pi)$ is zero. We
shall write

(1.2) $A_n = (1/n) \sum_{k=1}^{n} a_k$, $A^*_n = \sum_{k=n}^{\infty} a_k/k$,

and similarly we define $B_n$, $B^*_n$. If the series

(1.3) $\sum_{n=1}^{\infty} A_n \cos nx$, $\sum_{n=1}^{\infty} A_n \sin nx$, $\sum_{n=1}^{\infty} A^*_n \cos nx$, $\sum_{n=1}^{\infty} A^*_n \sin nx$

are Fourier series, the functions they represent will be denoted by $F_c$, $F_s$,
$F^*_c$, $F^*_s$ respectively. Similarly we have $G_c$, $G_s$, $G^*_c$, $G^*_s$.

Hardy [2] proved that if $f(x) \in L^p$ $(1 \le p < \infty)$, then $\sum_{n=1}^{\infty} A_n \cos nx$ is a
Fourier series of the class $L^p$. Recently, Bellman [1] proved$^1$ that if
$f(x) \in L^p$ $(1 < p < \infty)$ then $\sum_{n=1}^{\infty} A^*_n \cos nx$ is a Fourier series of class $L^p$.
Here the restriction to $p > 1$ is essential, since for $p = 1$ the numbers $A^*_n$
need not exist, as the familiar example of $f(x) \sim \sum_{n=2}^{\infty} \cos nx/\log n$ shows.
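
To see numerically why the tail sums fail to exist in this example, one can watch the partial sums of $\sum 1/(\nu \log \nu)$ grow. The sketch below assumes natural logarithms and uses a hypothetical helper `tail_partial`; by comparison with the integral of $1/(x \log x)$, the sum from $n$ to $N$ behaves like $\log\log N - \log\log n$, which is unbounded in $N$.

```python
import math


def tail_partial(n, N):
    """Partial sum of 1/(v * ln v) for v = n, ..., N (n >= 2)."""
    return sum(1.0 / (v * math.log(v)) for v in range(n, N + 1))


# Each squaring of the upper limit adds roughly ln 2 to the sum, so no limit exists.
for N in (10 ** 2, 10 ** 4):
    print(N, tail_partial(2, N))
```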


On account of the well-known theorems of M. Riesz (according to which
the series conjugate to a series of the class $L^p$ $(1 < p < \infty)$ is also of the
class $L^p$) the results of Hardy and Bellman show that if $f(x) \in L^p$


* Received January 27, 1948. 

$^1$ The result of Bellman was also obtained by Kawata and Sunouchi [3], [4].
These two papers came to my knowledge only very recently after the results of the
present paper had been obtained. Though the Kawata-Sunouchi theorems have points
in common with my results, none of the theorems established here is proved in their
papers.




$(1 < p < \infty)$ then all the series in (1.3) are Fourier series of the class $L^p$.

The arguments of Hardy and Bellman can also be applied to show that if
$g(x) \in L^p$ $(1 \le p < \infty)$ then $\sum_{n=1}^{\infty} B_n \sin nx$ is a Fourier series, and if $g(x) \in L^p$
$(1 < p < \infty)$ then $\sum_{n=1}^{\infty} B^*_n \sin nx$ is a Fourier series, both of the class $L^p$.
By making use of M. Riesz's theorem we see that if $g(x) \in L^p$ $(1 < p < \infty)$
then the four series obtained by replacing the $a_n$ in (1.3) by $b_n$ are also
Fourier series of the class $L^p$. The purpose of this paper is to investigate the
cases $p = 1$ and $p = \infty$. They are delicate and interesting.


2. First, we shall consider the four cases corresponding to the result
of Hardy. We formulate Hardy's theorem in the case of $p = 1$.

THEOREM 1. If $f(x) \in L$ then $\sum_{n=1}^{\infty} A_n \cos nx$ is a Fourier series.

By making use of Hardy's analysis one can also prove

THEOREM 2. If $g(x) \in L$ then $\sum_{n=1}^{\infty} B_n \sin nx$ is a Fourier series.

It is interesting to observe that in general, the condition $f(x) \in L$ does
not imply the Fourier character of $\sum_{n=1}^{\infty} A_n \sin nx$. For $\sum_{n=2}^{\infty} \cos nx/\log n$ is the
Fourier series of an even and integrable function but the corresponding
$A_n = (1/n) \sum_{\nu=2}^{n} 1/\log \nu \simeq 1/\log n$ and $\sum_{n=2}^{\infty} \sin nx/\log n$ is not a Fourier series.$^2$

In this case we see that $\sum_{n=1}^{\infty} a_n \sin nx$ is a series conjugate to $f(x) \sim \sum_{n=1}^{\infty} a_n \cos nx$.
It certainly represents a Fourier series of $\tilde{f}(x)$ if $|f| \log^+ |f| \in L$ (a theorem
of Zygmund: if $|h| \log^+ |h| \in L$ then $\tilde{h} \in L$, where $h$ is not necessarily
confined to be even or odd, and $\tilde{h}$ is the conjugate function of $h$). We also
observe that $\sum_{n=1}^{\infty} a_n \sin nx$ is a $g(x)$ with $a_n = b_n$. By using Theorem 2, we
conclude that $\sum_{n=1}^{\infty} A_n \sin nx$ is a Fourier series. Thus we have

THEOREM 3. If $|f| \log^+ |f| \in L$ then $\sum_{n=1}^{\infty} A_n \sin nx$ is a Fourier series.

Using Theorem 1 and applying a similar argument we can deduce


$^2$ Of course it does not follow from the fact that $\beta_n \simeq \beta'_n$ that the behavior of the
series $\sum \beta_n \sin nx$ and $\sum \beta'_n \sin nx$ is exactly the same. However in our example (and in
similar examples that follow) the assertion is justified by the very regular structure
of the sequences $\{\beta_n/\beta'_n\}$, $\{\beta'_n/\beta_n\}$. We omit the proofs which can easily be supplied
by the reader (see for example Zygmund [6]).




THEOREM 4. If $|g| \log^+ |g| \in L$ then $\sum_{n=1}^{\infty} B_n \cos nx$ is a Fourier series.


We shall now consider the four cases corresponding to the result of
Bellman. To make the situation clear, we note the following examples. We
see that $f(x) \sim \sum_{n=2}^{\infty} \cos nx/\log n$ is even and integrable but $A^*_n = \sum_{\nu=n}^{\infty} 1/\nu \log \nu$
does not exist. This means that the series $\sum_{n=2}^{\infty} A^*_n \cos nx$ and $\sum_{n=2}^{\infty} A^*_n \sin nx$
in general have no meaning. We also see that $g(x) \sim \sum_{n=2}^{\infty} \sin nx/(\log n)^2$ is
odd and integrable but $B^*_n = \sum_{\nu=n}^{\infty} 1/\nu(\log \nu)^2 \simeq 1/\log n$ and $\sum_{n=2}^{\infty} \sin nx/\log n$
is not a Fourier series. However, the following theorems can be established:

THEOREM 5. If $|g| \log^+ 1/|x| \in L$ then $\sum_{n=1}^{\infty} B^*_n \sin nx$ is a Fourier series.

THEOREM 6. If $|f| \log^+ 1/|x| \in L$ then $\sum_{n=1}^{\infty} A^*_n \cos nx$ is a Fourier series.

THEOREM 7. If $|g| \log^+ |g| \in L$ then $\sum_{n=1}^{\infty} B^*_n \cos nx$ is a Fourier series.

It is interesting to observe that the condition $|f| \log^+ |f| \in L$ does not
imply the Fourier character of $\sum_{n=1}^{\infty} A^*_n \sin nx$ as shown by the example
$f(x) \sim \sum_{n=2}^{\infty} \cos nx/(\log n)^2$. From Theorem 5 and the theorem of Zygmund
that if $|h| (\log^+ |h|)^{\alpha} \in L$, $\alpha > 0$, then $|\tilde{h}| (\log^+ |\tilde{h}|)^{\alpha-1} \in L$, we can
deduce the following

THEOREM 8. If $|f| \log^2 |f| \in L$ then $\sum_{n=1}^{\infty} A^*_n \sin nx$ is a Fourier series.

Indeed, under the condition $|f| \log^2 |f| \in L$ we have

$f(x) \sim \sum_{n=1}^{\infty} a_n \cos nx$, $\tilde{f}(x) \sim \sum_{n=1}^{\infty} a_n \sin nx$

and $|\tilde{f}| \log(|\tilde{f}| + 2) \in L$. Since $\tilde{f}(x)$ is a $g(x)$ with $a_n = b_n$, it follows
from Theorem 5 that $\sum_{n=1}^{\infty} A^*_n \sin nx$ is a Fourier series.


3. This section is devoted to the proofs of Theorems 5, 6, 7. Following
Hardy, we modify the definitions of $A_n$, $A^*_n$, $B_n$, $B^*_n$, and write


(Zygmund [6], p. 106, ex. 4, 5.)

$^3$ The condition given in Theorems 5 and 6 is less stringent than $|h| \log^+ |h| \in L$.


$A_n = (1/n) \sum_{\nu=1}^{n} a_\nu - a_n/2n$, $A^*_n = \sum_{\nu=n+1}^{\infty} a_\nu/\nu + a_n/2n$,

with similar formulas for $B_n$ and $B^*_n$. With these new definitions one gets
similar formulas, and the difference between the old and new $A_n$ (or $A^*_n$,
etc.) being $o(1/n)$, the difference of the corresponding functions is a
function integrable in any power (by a theorem of Hausdorff-Young) or
even exponentially integrable (see Zygmund [5]). We shall use two well-known
functions: the odd function $\tfrac12(\pi - x) \sim \sum_{n=1}^{\infty} \sin nx/n$ $(0 < x \le \pi)$
and the even function $\log 1/(2 \sin \tfrac12 x) \sim \sum_{n=1}^{\infty} \cos nx/n$ $(0 < x \le \pi)$. The method
used is to find explicit formulas for the functions with $A^*_n$ and $B^*_n$ as
Fourier coefficients.


Let
(3.1) $\alpha(x) = \tfrac12(\pi \operatorname{sgn} x - x) \sim \sum_{n=1}^{\infty} \sin nx/n$.
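
The expansion in (3.1) is the classical sawtooth series, and it can be checked numerically. The sketch below compares a long partial sum with the closed form on $(-\pi, \pi)$; the names `alpha` and `sine_partial_sum` are hypothetical, and the number of terms is an arbitrary choice, since convergence is slow (the error away from the origin is of order one over the number of terms).

```python
import math


def alpha(x):
    """Closed form (pi * sgn(x) - x) / 2 for -pi < x < pi."""
    sgn = (x > 0) - (x < 0)
    return 0.5 * (math.pi * sgn - x)


def sine_partial_sum(x, terms):
    """Partial sum of sin(n x) / n for n = 1, ..., terms."""
    return sum(math.sin(n * x) / n for n in range(1, terms + 1))


for x in (1.0, -2.0):
    print(x, alpha(x), sine_partial_sum(x, 20000))
```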
It is easy to see that si 
and that 


by c0s — f “g(t)a(a + t)dt. 


Hence 


"cos va S + datas, 


and (taking into account that the integral of > bv" cos vz over (0, 7) is zero) 


v=1 


> by/v — bn/2n = {sin nz cot $x + t)dtdz. 
y=1 0 


Subtracting the last formula from (3.2), we thus obtain 
oO 
B*,, = > by/v bn/2n 
v=n+1 


f nx cot Jf — +1) Jatae 
‘sin nx cot f [2a(t) — a(x + t) + —t)]dtdz. 


It remains to establish the integrability of the function 


(3. 3) cot M (a, t)at 




where
(3.4) $M(x,t) = 2\alpha(t) - \alpha(x+t) + \alpha(x-t)$.
It is easy to see that for $x$ and $t$ both in the interval $(0, \pi)$,
0 


This follows immediately from the definition of «(z). Hence 
I= f “cot M (2, t)dtde 
0 0 


0 0 


Si g(t) | (log dt. 


and 


~ 


Theorem 5 is thus proved. 
In order to prove Theorem 6, we argue as before and we get 


n 


av/v — An/2n = fa — cos nx) cot + t)dtdx 
3.¢) 


=1/r J cos na)eot f FH) [a(x +8) + a(x —t) ]dtdr. 


It suffices to establish the integrability of 


(3. 7) cot t) dt, 
where 

(3.8) N(2,t) —a(a@+t) 
since then 


nx cot F()N (a, t)dtar 0 
0 0 


as n—> © by the Riemann-Lebesgue theorem. Thus 


‘cot S FONG datas, 


by (3.6), and therefore A*, = > av/v + an/2n is the Fourier cosine coefficient 


v=nt+1 


of (3.7). 




We observe that for x and ¢ both in the interval (0, 7), 


(3.9) N(x, t) =4}n(1+ sgn(x—t)) 


Hence 


f cot de f FON thatae 


— f cot je x) f(t)dtde — 42 S 


The absolute value of the second integral is majorized by 2x f ‘ f(#)| dt, 
0 


while the absolute value of the first integral is majorized by 


Theorem 6 is thus proved. ; 
The method used to prove Theorem 7 is similar. We shall need the 
following 
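LEMMA. Let $\theta(x, |h|) = \theta(x, |h|, a, b) = \sup_{\xi} \frac{1}{\xi - x} \int_x^{\xi} |h(t)|\,dt$.
If $|h| \log^+ |h| \in L(a,b)$, then $\theta(x, |h|) \in L(a,b)$ and

$\int_a^b \theta(x, |h|)\,dx \le B \int_a^b |h| \log^+ |h|\,dx + C$,

where $B$ and $C$ depend on $b - a$ only (cf. Zygmund [6]).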
Since 


oo 
g(t) ~ bnsinnz, log1/2|sin $x |~ cos 
n=1 n=1 


we have 


basin nz/n—=1/m  9(t) (log 1/2 | sin (x + t)/2 |)dt. 


n=1 
It follows that 
by/v——2/x* ‘sin ve (log 1/2 | sin (a + t)/2 |) dtdx 
0 
and that 


—bn/2n 
=— 1/7’ fa — cos nx) cot 42 Q(t) (log 1/2 | sin (2 + t)/2 |) dtdz 


=— nx)cot f 10g (nina + 1)/2) 




As in the proof of Theorem 6, it suffices to prove that

(3.10) $-(1/2\pi) \cot \tfrac12 x \int_0^{\pi} g(t) \log |\sin \tfrac12(x-t) / \sin \tfrac12(x+t)|\,dt$

is integrable. Let us write $G(x) = \int_0^x g(t)\,dt$. Since $|g| \log^+ |g|$ is
integrable, the integral

(3.11) $\int_0^{\pi} g(t) \log |\sin \tfrac12(x-t) / \sin \tfrac12(x+t)|\,dt$

exists (see Zygmund [6], p. 106, ex. 4). Moreover, it is not difficult to
see that $G(x+\epsilon) - G(x) = o(1/|\log \epsilon|)$ as $\epsilon \to +0$ (see Zygmund [6],
p. 105, ex. 3; the general result stated there only gives $G(x+\epsilon) - G(x)
= O(1/|\log \epsilon|)$ but the passage from 'O' to 'o' is simple). Thus if we
split the integral (3.11) into the integrals over $(0, x)$ and $(x, \pi)$, integrate
each by parts and add the results we find that (3.11) equals


4 (G(t) — sine 
o 2®sin (x—t)/2sin (x +1t)/2 
(G(t) — G(z)) sine 
o 2sin (x—t)/2sin (x+ t)/2 


dt 


dt. 


In order to prove the integrability of (3.10), it suffices to prove that 


| G(t) — G(z)| 
Soot 3x1 a —t]/2)sin + O72 


is bounded. 


It is easy to see that 


| G(t) — G(a)| 
nef J, sin (| |/2) sin (x + t)/2 
If we notice that | «—t| and $(2-+¢) are both positive and not greater 
than we can use the formula =(2/7)@ in 7/2). Thus 


* | G(t) —G@(z)| 
(3. 13) f"f 




say. Now 


G(x) 
0 0 
0 0 0 
and 


Isat 


and it follows that 

(3. 14) 4I% fz, |)de. 
0 

Also, 


(3.15) f f = — dtde 
0 


IA 


n (| e—t#|/2) sin (x — t)/2 


7 | G(t) — G(2)| sine 
peter sin (| sin (2 —t)/2 


+ 
To treat J’,, we notice that 7/2 S(x + t)/ S 37/4, so that sin(x + t)/2 
=> sin 37/4 = 1/V 2, and 
0 0 0 


Lastly, in J”. we have | cot $z| <1 and sin (x + ¢t)/2 (x + 7) /2 
= cos 2/2, since Therefore 


0 0 


2|a—t|cosa/2 


Hence 


(3. 16) I< Gr? f 9 |) de. 


Combining (3.12), (3.14), (3.16) we find that the integral in (3. 12) 
is certainly bounded by 87? f 6(a, |g|)dx. An application of the Lemma 
0 


stated above completes the proof of Theorem 7. 




4. In this section we shall always assume that our functions $f(x)$ and
$g(x)$ do not exceed 1 in absolute value. Under this condition we shall
investigate the integrability of $\exp \lambda |F_c|$, $\exp \lambda |G_c|$, etc.

We first recall the proof of Theorems 5 and 6 in §3. We see that if


g(t) ~ > by sin nx then 


n=1 


(4. 1) G*, = cot M(x, at ~S Bey sin nz, 
0 n=1 


where M(z,t) is defined by (3.4) and has the properties (3.5). We also 


co 
see that if ~ an cos nav, then 


n=1 


(4. 2) F*, =x = cot $a (a, t)dt~S A*, cos nz, 
0 


n=1 


where N(az,¢) is defined by (3.8) and has the properties (3.9). We easily 
deduce from (4.1) and (4.2) the following results. 


THEOREM 9. If | g(x)| <1, then | G*,(x)| 
TuHeorEM 10. Jf | f(x)| <1, then | F*,(x)| S2. 


From the above two theorems, from the theorem that if | h(a) | <1, then 
expA | h(x)| is integrable for every A < 7/2 (Cf. [5]), and from the facts 
that G*,(x) is a G*(x), and F*,(z) is an F*,(x), we thus obtain the next 
two theorems. 


THEOREM 11. Jf | g(x)| <1, then 


fTexpa | G*,(x)| dx < 00 
0 
for every A < $x. 

THEOREM 12. If | f(x)| <=], then 


J exp A | F*,(x)| dx < 
0 
for every < 4a. 


In order to deduce the corresponding properties of expdA| F.(2)!, 
expr | F,(x)|, exprA| G*.(x)|, expA| G*s(x)|, we shall make use of an 


argument from Hardy’s paper [2]. He proved there that if f(z) ~ ¥ an cos na 
n=1 
then 
(4.3) F,(2) fot 4uf(u)du~ > An cos nz. 




oo 
In the same way, we can prove that if g(z) ~ > bn sin nz then 
n=1 


(4. 4) f cot jug(u)du~S Bu sin nz. 
z n=1 

From (4.3) and (4.4) and from the inequality 

we obtain 


13. If | f(x)| <1, then 


Jf ‘expa| Fe(2)| dx < 
0 
for every A <1. 

THeEorEM 14. Jf | g(x)| <1, then 


exp G.(x)|dxr< @ 
0 
for every A <1. 
In order to deduce further results we require the following 


THEOREM 15. If |h| <1, then 
(4. 6) A=4}| f h(t)cot 4¢ dt | 1/rlog*1/x + O(log 1/z), 


where h(t) (h(u)/2 tan §(t—u) )du is the conjugate function 
of h(t). 


Since its proof is rather long, we shall postpone it to 5. 

If we start with a bounded odd function g(x) ~> bdnsin nz, then 
G(x) Observing that g(x) is an f(x) with bn and 
making use of (4.3), we have 


(4.7) Ge(xz) 4 G(t) cot dt ~S By cos ne. 
n=1 


Using Theorem 15 we thus obtain 
16. If | 9(x)| <1, then 


Sema | G(x) | << 
for every \< 


With a similar argument, we obtain 




TueorEM 1%. If | f(x)| <1, then 


J expa | Fs(x)| < 
for every X< 7%. 


5. Proof of Theorem 15. We shall estimate the integral 


(5.1) A= | f “cot tan —u) )dudt |. 


We notice that we can interchange the order of integration here (for the 
inner integral is a Cauchy integral). If we let 


t-€ 
— /2 tan gu) da, 
€ 
then by a theorem of Zygmund (cf. [6] pp. 249-250), the function | fe(t) | 
is certainly bounded by an integrable function independent of «. Hence 


by the very well-known theorem of Lebesgue about the termwise integration 
of sequences of functions majorized by integrable functions, 


2 f cot f tan $(t—u) )duat 
t-€ 


We now interchange the order of integration and let e—-+0, then we get 
(5. 2) = fiw cot $¢ cot $(t —u) dtdu. 
Using the identity 

$\cot \tfrac12 t \cot \tfrac12(t-u) = \cot \tfrac12 u\,(\cot \tfrac12(t-u) - \cot \tfrac12 t) - 1$,


we reduce the above integral (5.2) to 


(5. 3) 2 fiw) cot u/2 f $(¢—u) — cot $]dtdu + 0(1). 
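
The identity used in this reduction is the cotangent subtraction formula $\cot(A - B) = (\cot A \cot B + 1)/(\cot B - \cot A)$ with $A = \tfrac12 t$ and $B = \tfrac12(t - u)$, solved for the product $\cot A \cot B$. A quick numerical check, with the hypothetical helper names `cot` and `identity_gap`:

```python
import math


def cot(x):
    """Cotangent, for x not a multiple of pi."""
    return math.cos(x) / math.sin(x)


def identity_gap(t, u):
    """Left side minus right side of the half-angle cotangent identity."""
    lhs = cot(t / 2) * cot((t - u) / 2)
    rhs = cot(u / 2) * (cot((t - u) / 2) - cot(t / 2)) - 1
    return lhs - rhs


print(identity_gap(1.0, 0.3))   # zero up to rounding
print(identity_gap(2.5, -0.7))  # zero up to rounding
```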




In order to establish (4.6), we have only to prove that 


(5. 4) fi 4 cot u/2 4(¢ —u) — cot ft] dt | du 


S log? 1/z + O(log 1/z). 


In the following calculations we shall always suppose that 7/2 >2> 0. 
We start our proof by first integrating the inner integral of (5.4). We get 


(5. 5) SJ 4(¢t—u) — cot $t]dt 
= log | sin $2/sin — uw) | — log (1/cos u/2). 


Since the integral 
f | $ cot u/2 log 1/cos u/2 | du 
is a finite constant, it is enough to estimate 


(5. 6) 1—f" 4 cot log | sin $2/sin — u)| | du. 
We observe that 


log | sin $2/sin $(a — u) | = 


Hence J can be written as follows 


log sin $z/sin}(w— for u> 2a, 
log sin $2/sin}(a—u) for u< za. 


(5.7) fi 4 cot u/2 log sin $2/sin 4(x— wu) | du 
+ f "} cot u/2 | log sin $2/sin $(u—2z)| du=I,+ 12, 
say. We shall reduce J, to the form 


sin sin 


sin $x sin 


— f “Foot u/2 SF cot tog 


in — 
and combine this with 


sin $2 


sin $2 


sin 42 




Thus we get 
sin + u) sin +), 
u/2 tog au + fF cot u/2 log ) 
(5. 8) + cot u/2 log du 
= K,+ K.+ Ks. 


Applying the transformation cot $4tan 4u—y to K,, and noticing that 


sind(a—u) 1—cot4rtan4u’ 


we have 


(5.9) (cot ie)? (y2 + (cot $2)2)) log ((1 + y)/(1—y)) dy 
< f ((1+9)/—y) ay. 


This integral is evidently convergent. 


By applying the same transformation to K2, we get 


2/ (1-tan? 2/2) 


(5.10) [og ((y + 1)/(y—1)) 
< fy Dog ((y + )/(y—1)) ay, 


which is again a convergent integral. 

Lastly, making the change of variable sin ½u = y sin ½x and noticing that
sin ½(u + x) sin ½(u − x) = (sin ½u cos ½x)² − (cos ½u sin ½x)², we reduce
K₃ to the following form


1/singz 1/sin 

(5. 11) K;= y log(y?—1)dyS log y*dy 
2cos 2/2 
1/sin 


< (2 log y/y) dy = log? (1/sin $2) 


$= \log^2 (1/x) + O(\log (1/x)).$
Thus we have
$I = K_1 + K_2 + K_3 \le \log^2 (1/x) + O(\log (1/x)).$


Theorem 15 is thus proved. 




I am grateful to Professor Zygmund for his suggestions and encouragement
during the preparation of this paper.


UNIVERSITY OF CHICAGO. 


REFERENCES. 


1. R. Bellman, “A note on a theorem of Hardy on Fourier constants,” Bulletin of
American Mathematical Society, vol. 50 (1944), pp. 741-744.

2. G. H. Hardy, “Note on some points in the integral calculus, lxvi, The arithmetic
mean of a Fourier constant,” Messenger of Mathematics, vol. 58 (1928),
pp. 50-52.

3. T. Kawata, “Notes on Fourier series (xii), On Fourier constants,” Proceedings of
the Imperial Academy, Tokyo, vol. 20 (1944), pp. 218-222.

4. G. I. Sunouchi, “On Fourier constants,” Proceedings of the Imperial Academy,
Tokyo, vol. 20 (1944), pp. 542-544.

5. A. Zygmund, “Sur les fonctions conjuguées,” Fundamenta Mathematicae, vol. 13
(1929), pp. 284-303, Corrigenda, ibid., vol. 18 (1932), p. 312.

6. A. Zygmund, Trigonometrical series, Warszawa-Lwów (1935).



ANALYTIC CONTINUATION OF FACTORIAL SERIES.* 


By V. F. Cowling.


The object of this paper is to extend to factorial series some results on
the analytic continuation of Dirichlet series proved by the author.¹

The fundamental results for factorial series are to be found in the books
and papers of E. Landau [2], Milne-Thomson [3], N. Nielsen [4] and N.
Nörlund [5]. A very complete and systematic study is to be found in Nörlund.

We shall prove the following theorem. 


THEOREM. Let 


$f(z) = \sum_{n=0}^{\infty} \dfrac{a_n \, n!}{z(z + 1) \cdots (z + n)}$

with abscissa of convergence γ < +∞. Let l₁ be a given real number. Let
l − 1 < h < l, where l > |l₁| is integral and positive. Denote by A(h, ψ₁, ψ₂)
a region of the w-plane ψ₂ ≤ Arg(w − h) ≤ ψ₁, where 0 < ψ₁ ≤ π/2 and
−π/2 ≤ ψ₂ < 0.


Suppose there exists at least one set of real numbers h, l, L, k, R₁, ψ₁ and
ψ₂, where h, l, ψ₁ and ψ₂ are subject to the above restrictions, and a function
a(w) satisfying the following conditions simultaneously.


a. With the possible exception of the point at infinity a(w) is regular
in the region A(h, ψ₁, ψ₂).

b. a(n) = a_n, n = l, l + 1, ···.

c. For w = h + Re^{iψ} in the region A(h, ψ₁, ψ₂)
$a(h + Re^{i\psi}) = O[R^k \exp(-LR \sin \psi)],$
for R ≥ R₁ and some 0 < L < 2π.


Then f(z) may be continued analytically over any closed bounded region
contained in the half-plane ℜ(z) > l₁, not containing any of the points
z = 0, −1, −2, ···.


Proof. Suppose then for a given function f(z) and a given real number l₁


* Received May 24, 1948. 
¹ See Cowling, V. F. [1]. Numbers in brackets refer to the bibliography at the
end of the paper.




there exists a set of constants h, k, L, R₁, ψ₁ and ψ₂ satisfying the conditions
of the theorem.

Denote respectively by J₁, J₂ and c the sides of a contour Γ(h, R′) bounded
by the lines w = h + Re^{iψ₁} and w = h + Re^{iψ₂} and arc ψ₂ ≤ Arg(w − h) ≤ ψ₁
of a circle |w − h| = R′ of integral radius R′.

Let z be any complex number subject to the restriction that ℜ(z) > l₁.
Then since by hypothesis l > |l₁| it follows that none of the solutions in w
of w + z + 1 = N, N = 0, −1, −2, ··· are contained in a region bounded
by Γ(h, R′). Denote by B any closed bounded region contained in the
half-plane ℜ(z) > l₁ which does not contain any of the points z = 0, −1, −2, ···.

It is then simple to show by means of the Calculus of Residues that
(integration being understood in the positive sense) for z contained in the
half-plane ℜ(z) > l₁


Sz +w-+1)[exp(2riw) — 1] 


T(h,R’) 


(1) 
Set 
(2) G(z,w) + w +1) [exp(2xiw) —1]}. 


We proceed to show that
$\int_c G(z, w) \, dw$

converges uniformly to zero as R′ becomes infinite for z on a closed bounded
portion of the real axis contained in the half-plane ℜ(z) > l₁.
On the arc c

(3) $| \exp(2\pi i w) - 1 |^{-1} = O(1).$

This follows from the periodic character of the function and the fact that w
is bounded away from an integer by our choice of h, l and R′.
From the identity 


$1/\{\exp(2\pi i w) - 1\} = \exp(-2\pi i w)/\{1 - \exp(-2\pi i w)\}$
it follows that for w on c

(4) $| \exp(2\pi i w) - 1 |^{-1} = O[\exp(2\pi R' \sin \psi)].$

This inequality is useful for −π/2 ≤ ψ < 0.

By a well known result [4; p. 96] from the theory of the Gamma Function,
if we let z denote any complex number independent of w contained in a
closed bounded region, then we may write



(5) $\Gamma(w) \, w^z / \Gamma(w + z) = 1 + \epsilon(z, w),$


where ε(z, w) tends uniformly to zero as w tends to infinity in such a way
as to be bounded away from the negative real axis. In the definition of w^z
we take that branch of log w which equals zero for w = 1.
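
The behaviour described by (5) can be checked numerically in a restricted setting. The following is only an illustrative sketch, not part of the paper: it assumes real w > 0 and real z (so that Python's `math.lgamma` applies), and the names `gamma_ratio` and `epsilon` are hypothetical. It shows the trend along the positive real axis and says nothing about uniformity over a complex region.

```python
import math


def gamma_ratio(w: float, z: float) -> float:
    """Return Gamma(w) * w**z / Gamma(w + z) for real w > 0 and w + z > 0.

    Logarithms are used so that large w does not overflow.
    """
    return math.exp(math.lgamma(w) + z * math.log(w) - math.lgamma(w + z))


def epsilon(w: float, z: float) -> float:
    """The remainder epsilon(z, w) of (5), restricted to real arguments."""
    return gamma_ratio(w, z) - 1.0


if __name__ == "__main__":
    # The remainder shrinks as w grows along the positive real axis.
    for w in (10.0, 100.0, 1000.0, 10000.0):
        print(w, epsilon(w, 2.5))
```

For z = 2 the ratio reduces to w/(w + 1), which gives an exact value to compare against.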

By application of condition (c) of the hypothesis and equations (3),
(4) and (5) to equation (2) we obtain after simplification, for w on c, and
z = r on an interval of the positive real axis contained in the half-plane
ℜ(z) > l₁,
(6) G(z,w) = O[R*™ exp(— LR’ siny —r log R’ — rN 


= O[ exp( (22 — L) R’ siny —r log R’ 
−π/2 < ψ < 0,
where
(7) $N(R', \psi) = \tfrac12 \log (1 + h^2/R'^2 + 2h \cos \psi / R').$


Clearly N(R’,y) converges uniformly to zero as R’ becomes infinite for 


Thus after some simplification it follows from (6) that 


(8) f. G(z,w)dw = O[R*?],. 


By inspection of the right hand member of (8) it is evident that the
integral along c converges uniformly to zero as R′ becomes infinite for z = r
on any closed bounded portion of the positive real axis contained in the
half-plane ℜ(z) > l₁ for which |z| = r ≥ k + 2 + δ, δ > 0.

Hence for z = r on a closed bounded portion of the positive real axis contained
in the half-plane ℜ(z) > l₁ for which |z| = r ≥ Max[k + 2 + δ, γ + δ],
δ > 0, we have


where 


J (2.1) = G(z,w)dw and J(z, We) G(z, w)dw. 


We next consider under what restrictions on z, J(z, ψ₁) and J(z, ψ₂)
converge uniformly. We omit the details in view of the fact that they are
quite the same as in [1] at this stage.




By application of condition (c) of the hypothesis and equations (3) and
(5) to equation (2) we find after some simplification that for z = re^{iθ}
contained in a closed region B


(12) G(z,h + Res) dR exp (— LR sin dR]. 
Ry Ry, 


It follows immediately from (12) that J(z, ψ₁) is uniformly convergent
for z = re^{iθ} contained in the region B. Hence by a well-known theorem
[6; p. 99] it follows that J(z, ψ₁) represents a regular function of z in any
such closed region B.

We consider next the uniform convergence of J(z, ψ₂). Again it is
simple to show that for z in the region B


(13) f G(z,h + RIM) dR = O[ exp (# sin — L)) dR]. 
R, 


Since 0 < L < 2π by hypothesis and sin ψ₂ is negative it is easily seen by
inspection of the right hand member of (13) that J(z, ψ₂) converges uniformly
for z = re^{iθ} contained in a region B. As above it is then simple to demonstrate
that J(z, ψ₂) defines a regular function of z in a region B.

As the region B may be chosen so as to include any bounded portion
of the positive real axis for which |z| = r ≥ Max[k + 2 + δ, γ + δ, l₁ + δ],
δ > 0, in its interior, it is a consequence of (9) that
provides the analytic continuation of


over any closed region B. This completes the proof. 


LEHIGH UNIVERSITY. 


BIBLIOGRAPHY. 


1. V. F. Cowling, “Some results for Dirichlet series,” Duke Mathematical Journal,
vol. 14, No. 4 (1947), pp. 907-912.

2. E. Landau, “Ueber die Grundlagen der Theorie der Fakultätenreihen,”
Sitzungsberichte der Königlichen Bayerischen Akademie der Wissenschaften zu
München, vol. 36 (1906), pp. 151-218.

3. L. Milne-Thomson, The Calculus of Finite Differences, London (1933).

4. N. Nielsen, Handbuch der Theorie der Gammafunktion, Leipzig (1906).

5. N. Nörlund, Leçons sur les séries d’interpolation, Paris (1926).

6. E. Titchmarsh, The Theory of Functions, Oxford University Press (1939).




REPRESENTATIONS OF SEQUENCES OF SETS.* 


By C. J. Everett¹ and G. Whaples.


A class of sets is representable if there exists a choice function which
assigns to each set in the class an element of that set, in such a way that
the same element is not assigned to two different sets. To handle in a precise
manner the more general (though scarcely more difficult) case in which
the same set might be counted more than once, we define a sequence of sets
to be a mapping of a set Γ (whose elements γ shall be called indices) into
subsets M(γ) of another set E (whose members shall be called elements),
with no assumption that all subsets of E appear among the M(γ) or that
different indices correspond to different subsets. We denote such a sequence
by (M(γ), Γ) or, if the mapping M is clear, by “sequence on Γ.” By
subsequences, intersections of two subsequences, etc., we mean sequences on
subsets of the index set, on intersections of two subsets, etc., all with the
same mapping M. A sequence is called representable if there exists a choice
function f which assigns to each index γ an element f(γ) of the corresponding
set M(γ) such that the same element is never assigned to two different indices.


1. Generalization of Hall’s theorem. A sequence shall be called finite
if its set of indices is finite. For such sequences, P. Hall [2] has proved:


THEOREM 1. A finite sequence (M(γ), Γ) is representable if and only
if it satisfies the condition


(H) For every integer k and every set Σ of k indices, the union of the sets
M(σ) with σ ∈ Σ contains at least k elements.


For the convenience of the reader we insert a new and short proof.²
(H) is clearly necessary for representability. If we introduce the notation
#E for the cardinal of a set E, and M(Σ) for the union³ of all sets M(σ)
with σ ∈ Σ, then (H) can be restated:


* Received November 5, 1946; revised May 21, 1948. Presented to the American
Mathematical Society, April 25, 1947.

¹ Part of the work of this author was supported by the University of Wisconsin
Alumni Research Foundation.

² The basic idea of this proof was suggested by a conversation with P. Erdős; he has
an independent proof of our Theorem 2, using transfinite induction and the above
method of deletions. W. Gustin also contributed improvements.

³ We shall use this abbreviation throughout the paper. A more common convention
would have M(Σ) mean the class of all sets M(σ), rather than their union; but we
have no need of this class of sets.




#M(Σ) ≥ #Σ for every Σ ⊂ Γ.
Call the subsequence on Σ perfect if #M(Σ) = #Σ.


LEMMA 1. The union and intersection of any two perfect subsequences
of a sequence which satisfies (H) are perfect.


Proof. Let the sequences on Σ₁ and Σ₂ be perfect. By (H) we have
already:
#M(Σ₁ ∪ Σ₂) ≥ #(Σ₁ ∪ Σ₂).
Also,
#(Σ₁ ∪ Σ₂) = #Σ₁ + #Σ₂ − #(Σ₁ ∩ Σ₂)
and
#M(Σ₁ ∪ Σ₂) ≤ #M(Σ₁) + #M(Σ₂) − #M(Σ₁ ∩ Σ₂).


Since, by (H), #M(Σ₁ ∩ Σ₂) ≥ #(Σ₁ ∩ Σ₂), we get #M(Σ₁ ∪ Σ₂)
≤ #(Σ₁ ∪ Σ₂), and the lemma follows.


Proof of theorem. It is trivial for all sequences with #Γ = 1; assume
it for all sequences with less than n indices and suppose that (M(γ), Γ)
satisfies (H) and that Γ = {1, 2, ···, n}. For each element e of the set
M(n), consider the sequence obtained by deleting e (if it occurs) from each
M(γ) and letting γ run over Γ′ = {1, 2, ···, (n − 1)}. If, for any such e,
the corresponding sequence should satisfy (H), then it would be representable
and we could get a representation of the original sequence by assigning to n
this element e. Suppose then that none of these new sequences satisfies (H).
Then there must exist, for each e ∈ M(n), a set of indices Σ_e such that:


Σ_e ⊂ Γ′, hence n ∉ Σ_e, and #(M(Σ_e) − {e}) < #Σ_e.


Since deletion of e can change the cardinal of M(Σ_e) by at most 1, and
since the original sequence satisfied (H), we see from the second relation
that M(Σ_e) always contains e and that the sequences (M(σ), Σ_e) are perfect.
Let Σ′ be the union of the Σ_e; then, by Lemma 1, the sequence on Σ′ is perfect and
M(Σ′) ⊃ M(n). Hence the set Σ, obtained by adjoining n to the set Σ′,
contradicts (H). (#M(Σ) = #M(Σ′) = #Σ − 1.)
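
The deletion argument of this proof is constructive for small examples. The sketch below is only an illustration under stated assumptions: a finite sequence is modelled as a Python dict from indices to sets, condition (H) is checked by brute force over all subsets of indices (exponential, so suitable only for tiny inputs), and the names `satisfies_H` and `represent` are hypothetical.

```python
from itertools import combinations


def satisfies_H(seq):
    """Brute-force check of condition (H) for a dict {index: set of elements}."""
    indices = list(seq)
    for k in range(1, len(indices) + 1):
        for sigma in combinations(indices, k):
            union = set().union(*(seq[i] for i in sigma))
            if len(union) < k:
                return False
    return True


def represent(seq):
    """Return a dict {index: representative}, or None if (H) fails.

    Follows the deletion method: pick an element e of the last set such that
    deleting e from the remaining sets preserves (H), then recurse.
    """
    if not satisfies_H(seq):
        return None
    if not seq:
        return {}
    indices = list(seq)
    n = indices[-1]
    for e in sorted(seq[n]):
        reduced = {i: seq[i] - {e} for i in indices[:-1]}
        if satisfies_H(reduced):
            rep = represent(reduced)
            rep[n] = e
            return rep
    return None  # not reached when (H) holds, by Theorem 1


if __name__ == "__main__":
    example = {1: {"a", "b"}, 2: {"a"}, 3: {"b", "c"}}
    print(represent(example))
```

In the example, deleting "b" from the first two sets would break (H), so the method assigns "c" to index 3, which matches the case distinction in the proof.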

Hall’s theorem would be false if the restriction to finite index sets were
dropped. To see this, let Γ be the set of all positive integers together with
the symbol ω, and define M(n) = {n}, M(ω) = set of all positive integers.
Then (H) is true, even for infinite index sets. But one sees easily that no
number will work as the representative for ω, so this sequence is not
representable. In view of our Theorem 2′, we point out that it has the further
property that every finite subsequence is representable.

We prove however the following generalization of Hall’s theorem, which
has no restriction on the cardinal of the index set:⁴


THEOREM 2. If (M(γ), Γ) is any sequence of finite sets (i.e. if each
M(γ) is finite) then it is representable if and only if #M(Σ) ≥ #Σ for
every finite subset Σ of Γ.


In view of Hall’s result, this is equivalent to: 


THEOREM 2′. A sequence of finite sets is representable if and only if
every finite subsequence is representable.


Proof. Let (M(γ), Γ) be a sequence of finite sets, with every finite
subsequence representable. A function f, defined on a subset Δ of the index
set to the set E, shall be called a universal representation function (u.r.f.)
if every finite subsequence has a representation which agrees with f for all
indices for which f is defined, i.e. if each (M(σ), Σ) with finite Σ has a
representation r with r(σ) = f(σ) on Σ ∩ Δ. (This implies, of course, that
f(δ) ∈ M(δ): consider the subsequence on the single index δ.)

If ∅ denotes the null set, and * the “function whose range of definition
is ∅,” then * shall be considered as a function, and hence a u.r.f. The fact
that it is one is not a complete triviality, since it demands that every finite
subsequence be representable, viz., the main assumption of the theorem.

We prove our theorem by showing that there is a u.r.f. defined over the
whole set Γ. Our proof uses the lemma of Zorn: Any non-empty partially
ordered set, in which each linearly ordered subset has an upper bound, contains
a maximal element [9]. We partially order the set of u.r.f. by defining
one of them to be less than a second if the range of definition of the second
includes that of the first, and if they agree whenever both are defined.


LEMMA 2. Every linearly ordered set of u.r.f. has an upper bound.


(This lemma needs no assumption about the sequence.) Let Φ be a
linearly ordered set of u.r.f., Λ the set of all indices for which any one of
them is defined: we must show that there is a u.r.f. defined on all of Λ and
agreeing with every function in Φ. How to define this function is clear:
if λ ∈ Λ, define l(λ) to be f(λ), where f is any function in Φ which is defined
for λ. One sees that there is at least one such f, and that l(λ) does not
depend on which one we choose. If Σ is any finite index set, there is some f
in Φ which is defined for all indices in Σ ∩ Λ: since f is a u.r.f., Σ has a
representation which agrees with f and hence also with l.


⁴ R. Rado [4], a paper which was called to our attention by the referee, gives an
entirely different generalization. He keeps the index set finite (except for one unproved
remark) but replaces the cardinal numbers of the sets by a general metric, and extends
the idea of a representation.


LEMMA 3. Let f be a u.r.f., with range of definition Δ, for a sequence
(M(γ), Γ) of finite sets; and let τ be any index not in Δ. Then f can be
extended to a u.r.f. whose range of definition is Δ ∪ {τ}.


Suppose the lemma false. Then for every element e of the finite set
M(τ), the extension of f obtained by assigning to τ the representative e fails
to be a u.r.f. Then there exists for each e ∈ M(τ) a finite set of indices Σ_e,
such that τ ∈ Σ_e, and every representation of the sequence on Σ_e either
disagrees with f or assigns to τ a representative different from e. For each
e ∈ M(τ) choose one such Σ_e and let Σ be their union. Since M(τ) was finite,
Σ is finite. Any representation of (M(σ), Σ) would assign to τ some
representative e ∈ M(τ), would induce a representation of the subsequence on Σ_e,
and hence would disagree with f. This contradicts the assumption that f
was a u.r.f., and proves the lemma.

These two lemmas, together with the lemma of Zorn, clearly imply
Theorem 2. Indeed, if we consider the set of all u.r.f. which exceed a given
one, we get the somewhat stronger proposition: Any u.r.f. of a sequence
of finite sets can be extended to a representation of the whole sequence.


2. Representations relative to a partition. A partition of a set E is
a class 𝔓 = {𝔭} of disjoint subsets of E, whose union is the whole set.
(Nature of our proofs will justify our choice of notation.) A sequence
(M(γ), Γ) of subsets of E has a representation relative to 𝔓 (abbreviated to
𝔓-representation) if there is a function f which, in addition to being a
representation, satisfies the condition that representatives of different indices are
contained in different elements of 𝔓 (α ≠ β, f(α) ∈ 𝔞, f(β) ∈ 𝔟 implies 𝔞 ≠ 𝔟).

A subset of E shall be called finite with respect to 𝔓 (𝔓-finite) when
it meets only a finite number of elements of 𝔓. A class or sequence of sets
is 𝔓-finite when all its members are.


THEOREM 3. If 𝔓 is a partition of E, and if (M(γ), Γ) is a 𝔓-finite
sequence of subsets of E, this sequence has a 𝔓-representation if and only if,
for every finite index set Σ, the number of 𝔭 ∈ 𝔓 which meet M(Σ) is at least
equal to the number of indices in Σ.


Proof. (Method due to P. Hall [2].) For each γ ∈ Γ, define the subset
𝔐(γ) of 𝔓 to be the set of all 𝔭 ∈ 𝔓 which contain elements of M(γ). Since
each M(γ) is 𝔓-finite, the 𝔐(γ) are finite sets. The number of elements
(of 𝔓!) in 𝔐(Σ) is at least #Σ, for each finite Σ. So by Theorem 2, the
sequence (𝔐(γ), Γ) has a representation 𝔣, mapping Γ into 𝔓. Any f which
maps Γ into E in such a way that f(γ) ∈ 𝔣(γ) ∩ M(γ) for all γ (axiom of choice) is
a 𝔓-representation.
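
The reduction used in this proof (replace each set by the family of partition blocks it meets, represent the blocks, then choose an element inside each chosen block) can be sketched for finite data. This is an illustration only: the block-level representation is found here with a standard augmenting-path matching, which is a swapped-in technique and not the authors' argument, and the names `block_of` and `partition_representation` are hypothetical.

```python
def block_of(partition, x):
    """Return the name of the partition block containing element x."""
    for name, block in partition.items():
        if x in block:
            return name
    raise ValueError(f"{x!r} lies in no block of the partition")


def partition_representation(seq, partition):
    """seq: {index: set of elements}; partition: {block name: set of elements}.

    Returns {index: element} with representatives in distinct blocks,
    or None if no such choice exists.
    """
    # For each index, the set of blocks met by M(index).
    blocks_met = {g: {block_of(partition, x) for x in m} for g, m in seq.items()}
    owner = {}  # block name -> index currently represented by that block

    def try_assign(g, seen):
        for b in sorted(blocks_met[g]):
            if b in seen:
                continue
            seen.add(b)
            if b not in owner or try_assign(owner[b], seen):
                owner[b] = g
                return True
        return False

    for g in seq:
        if not try_assign(g, set()):
            return None
    # Inside each chosen block pick an element of the corresponding set.
    return {g: min(seq[g] & partition[b]) for b, g in owner.items()}


if __name__ == "__main__":
    partition = {"p": {1, 2}, "q": {3, 4}, "r": {5, 6}}
    seq = {"x": {1, 2, 3}, "y": {1}, "z": {2, 4, 5}}
    print(partition_representation(seq, partition))
```

The final step takes the element from the intersection of the chosen block with M(γ), so that the result is a representation of the original sequence and not only of the blocks.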


3. de Bruijn’s theorem. Suppose E has two partitions, 𝔄 = {𝔞} and
𝔅 = {𝔟}. We shall say that 𝔄 and 𝔅 have a common representation when
there exists a one-one correspondence 𝔞 ↔ 𝔟 between 𝔄 and 𝔅 for which
corresponding 𝔞’s and 𝔟’s have non-empty intersections, together with functions
f defined on 𝔄 and g on 𝔅 such that f(𝔞) = g(𝔟) ∈ 𝔞 ∩ 𝔟 whenever
𝔞 ↔ 𝔟.

We can prove very easily the following theorem, which was first proved
by de Bruijn (although our work was independent of his).


THEOREM 4. If 𝔄 and 𝔅 are two partitions of a set E, and if each of
them is finite with respect to the other, then they have a common representation
if and only if, for all integers k, the union of each k elements 𝔞 ∈ 𝔄
meets at least k elements of 𝔅, and also the union of each k elements 𝔟 ∈ 𝔅
meets at least k elements of 𝔄.


Proof. Apply Theorem 3, with 𝔄 as the class of sets and 𝔅 the partition:
𝔄 has a representation relative to 𝔅. Similarly, 𝔅 has a representation
relative to 𝔄. We only need further:


LEMMA 4. If 𝔄 and 𝔅 are partitions of a set, if 𝔄 has a representation
relative to 𝔅, and if 𝔅 has a representation relative to 𝔄, then 𝔄 and 𝔅
have a common representation.


Proof. Let f be the representation of 𝔄 relative to 𝔅. Then if 𝔣(𝔞)
denotes that element of partition 𝔅 which contains f(𝔞), this function 𝔣
defines a one-one correspondence between 𝔄 and a subset of 𝔅. In just the
same way we get a one-one function 𝔤 mapping 𝔅 onto a subset of 𝔄. A
theorem of Banach [5, p. 90] asserts that in just this situation there exist
disjoint splittings 𝔄 = 𝔄₁ ∪ 𝔄₂ and 𝔅 = 𝔅₁ ∪ 𝔅₂ such that 𝔣 gives a one-one
correspondence between 𝔄₁ and 𝔅₁ and 𝔤 a one-one correspondence between
𝔅₂ and 𝔄₂. This correspondence, together with the function h, defined to
be f on 𝔄₁ and g𝔤⁻¹ on 𝔄₂, gives the desired representation.


COROLLARY 4.1. If 𝔄 and 𝔅 are two partitions of E for which all 𝔞’s
and all 𝔟’s have the same finite number h of elements, then 𝔄 and 𝔅 have
a common representation.


For the union of k members of 𝔄 contains kh elements of E, hence meets
at least k elements of 𝔅.


COROLLARY 4.2. The systems of left and right cosets of a finite subgroup
of any group have a common representation.


König and Valkó [3] proved 4.1. Van der Waerden [7] used their
result to establish 4.2, and showed that this corollary can be false for an
infinite subgroup of infinite index, but proved it (using group theory and
Theorem 4 applied to finite partitions) for all subgroups of finite index.
A proof of this case of the corollary can also be found in Zassenhaus [8,
pp. 11 and 35].
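
Corollary 4.2 can be checked on a small example. The sketch below is illustrative only and assumes a finite group given as a list of permutation tuples; it pairs left and right cosets that intersect by a standard augmenting-path matching (a swapped-in technique, not the route through Banach's theorem used above) and picks one element from each paired intersection. The names `compose` and `common_representatives` are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def common_representatives(G, H):
    """Return elements that represent every left coset gH and every right
    coset Hg of the subgroup H in the finite group G exactly once."""
    left = sorted({frozenset(compose(g, h) for h in H) for g in G}, key=sorted)
    right = sorted({frozenset(compose(h, g) for h in H) for g in G}, key=sorted)
    match = {}  # position of a right coset -> position of its left coset

    def try_assign(i, seen):
        for j, r in enumerate(right):
            if j in seen or not (left[i] & r):
                continue
            seen.add(j)
            if j not in match or try_assign(match[j], seen):
                match[j] = i
                return True
        return False

    for i in range(len(left)):
        if not try_assign(i, set()):
            return None
    return sorted(min(left[i] & right[j]) for j, i in match.items())


if __name__ == "__main__":
    G = list(permutations(range(3)))   # the symmetric group on 3 letters
    H = [(0, 1, 2), (1, 0, 2)]         # a subgroup of order 2 (not normal)
    print(common_representatives(G, H))
```

Because the subgroup in the example is not normal, its left and right cosets differ, so the existence of a common system of representatives is not immediate from the definitions.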


4. Concluding remarks. The full power of Theorem 2 is not needed
to prove Theorem 4; de Bruijn uses the following trick, similar to one of
König, for reducing this theorem to the denumerable case. Call 𝔞 ∈ 𝔄 and
𝔟 ∈ 𝔅 immediately connected if their intersection is non-null, and any two
elements of 𝔄 ∪ 𝔅 connected if there exists a finite chain of immediately
connected elements leading from one to the other: one sees at once that 𝔄
and 𝔅 fall into denumerable, connected components and that if 𝔄′, 𝔅′ are
components of 𝔄 and 𝔅 then the unions of their members are either disjoint
or equal.

So only the denumerable case of Theorem 2 is needed for de Bruijn’s
Theorem. One may then avoid the theorem of Banach, also, by using an
alternating process to build the common representation so as to insure that
each 𝔞 ∈ 𝔄 and each 𝔟 ∈ 𝔅 has a representative, using Lemma 3 at each step.

No similar reduction of Theorem 2 to the denumerable case seems
possible: and we would conjecture that the axiom of choice (or lemma of
Zorn) for arbitrary cardinals is really necessary for proof of this theorem.
Certainly no such simple decomposition as de Bruijn uses is available in
this case. For example, let (x, y) → z be a one-one correspondence of points
of the unit square to points of unit interval, and consider the class of sets
(of one, two, or three elements) {x, y, z} where (x, y) → z. It is
nondenumerable and has a representation (take z as the representative), but every
set is connected with every other set.

Theorem 2 can be proved much more simply in the denumerable case.
Let Γ be the set of positive integers, suppose (M(γ), Γ) satisfies the
assumptions of Theorem 2, and consider all functions f, defined on Γ, with
f(ν) ∈ M(ν) for each ν ∈ Γ. Let A_n be the set of all such functions for which
the values f(1), f(2), ···, f(n) are all distinct. We have to show that the
intersection of all the sets A_n is not empty. If we make the set of all these
functions f into a metric space by defining ρ(f, g) = 1/n where n is the
smallest integer ν for which f(ν) ≠ g(ν) [6, p. 74], then the A_n are closed
and compact, non-empty (by Theorem 1) and nested, hence have a non-null
intersection by the Cantor theorem [6, p. 30]. One could also show this fact
by use of the Cantor diagonal process.


UNIVERSITY OF WISCONSIN AND INDIANA UNIVERSITY. 


BIBLIOGRAPHY. 


1. N. G. de Bruijn, “Gemeenschappelijke representantensystemen van twee
klassenindeelingen van een verzameling,” Nieuw Archief voor Wiskunde (2), vol. 22
(1943), pp. 48-52.

2. P. Hall, “On representation of subsets,” Journal of the London Mathematical
Society, vol. 10 (1935), pp. 26-30.

3. D. König and S. Valkó, “Über mehrdeutige Abbildungen von Mengen,”
Mathematische Annalen, vol. 95 (1925), pp. 135-138.

4. R. Rado, “A theorem on general measure functions,” Proceedings of the London
Mathematical Society, vol. 44 (1938), pp. 61-91.

5. W. Sierpinski, Leçons sur les nombres transfinis, Paris, 1928.

6. W. Sierpinski, Introduction to General Topology, University of Toronto Press, 1934.

7. B. L. van der Waerden, “Ein Satz über Klasseneinteilungen von endlichen Mengen,”
Hamburger Abhandlungen, vol. 5 (1927), pp. 185-188.

8. H. Zassenhaus, Lehrbuch der Gruppentheorie, Berlin, 1937.

9. M. Zorn, “A remark on method in transfinite algebra,” Bulletin of the American
Mathematical Society, vol. 41 (1935), pp. 667-670.


METRIC PROPERTIES OF DIFFERENTIAL EQUATIONS.* 


By D. C. Lewis. 


1. Introduction. The purpose of this paper is to discuss the dependence
of solutions of a system of differential equations,


(1.1) $dx_i/dt = X^i(x_1, \cdots, x_n, t) = X^i[x, t],$


on the initial conditions. The classical existence proofs yield inequalities
which show that the solutions depend continuously on the initial conditions,
at least, if the right hand members satisfy a Lipschitz condition. In this
paper we shall refine and supplement these classical inequalities, assuming
that the $X^i$ are of class C′. In so doing we call attention to the fundamental
importance of a certain quadratic form Q to be introduced in the sequel.
Still further, but less useful, refinements may be obtained on the assumption
that the $X^i$ are of class C″, in which case a certain biquadratic form B plays
the decisive role. It is also indicated, by consideration of simple examples,
that our fundamental theorems are the best possible ones of their type. A
simple application to the theory of qualitative integration is given.

Our theorems are specified in terms of a metric which we assign to the
space of the $(x_1, \cdots, x_n)$. For most purposes it is sufficient to assume a
Riemannian space. Much of the analysis, however, is not more difficult, if
we assume a general Finsler space.

Let $f[x, \dot x] = f(x_1, \cdots, x_n, \dot x_1, \cdots, \dot x_n)$ be of class C‴ and positively
homogeneous of degree +1 in $\dot x_1, \cdots, \dot x_n$. Furthermore, suppose $f[x, \dot x] > 0$,
unless all the $\dot x$’s are zero, and that $f_{\dot x_i \dot x_j}[x, \dot x] \lambda^i \lambda^j > 0$ for all sets $\lambda^1, \cdots, \lambda^n$
not proportional to $\dot x_1, \cdots, \dot x_n$.¹ Here, as in the sequel, the summation
convention of tensor analysis is used, with all repeated indices summed from 1
to n. The length of an arc, $x_i = x_i(\tau)$, $i = 1, \cdots, n$, $0 \le \tau \le 1$, is defined
as the value of the integral $\int_0^1 f[x(\tau), x'(\tau)] \, d\tau$, which exists if the arc is
sufficiently regular and is, moreover, independent of the parametrization on


* Received May 11, 1948.

¹ The purpose of this condition is to insure the “regularity” of the variational
problem $\delta \int f[x, \dot x] \, d\tau = 0$ in parametric form. Cf., for instance, Marston Morse, “The
calculus of variations in the large,” American Mathematical Society Colloquium
Publications, vol. 18 (1934), p. 121.


account of the assumed homogeneity of $f[x, \dot x]$. The distance between two
points is the length of the shortest arc connecting the two points. The
parametric equations of such a geodesical arc are well known to satisfy
Euler’s equations,


(1.2) $f_{x_i}[x(\tau), x'(\tau)] - (d/d\tau) f_{\dot x_i}[x(\tau), x'(\tau)] = 0.$


2. Fundamental results on the hypothesis that the $X^i$ are of class C′.
THEOREM 1. Let us consider two integral arcs of the system (1.1),
$C_0$: $x_i = x_i(t, 0)$ and $C_1$: $x_i = x_i(t, 1)$.


We let D(t) denote the distance between [x(t, 0)] and [x(t, 1)] measured
along the geodesical arc $g_t$. We furthermore assume that all these $g_t$ are
included within an open region R, and that no complete geodesic obtained
by extending a $g_t$ contains entirely within R an arc whose end points are
conjugate to each other (in the sense of the calculus of variations).


If
(2.1) $\alpha(t) \le \{ f_{x_i}[x, \lambda] X^i[x, t] + f_{\dot x_i}[x, \lambda] X^i_{x_j}[x, t] \lambda^j \} \le \beta(t)$


for every set $\lambda^1, \cdots, \lambda^n$ such that $f[x, \lambda] = 1$ and at every point [x] on
$g_t$, then


(2.2) $D(t_0) \exp \int_{t_0}^{t} \alpha(u) \, du \le D(t) \le D(t_0) \exp \int_{t_0}^{t} \beta(u) \, du.$

Before proceeding with the proof of this theorem we note that in the
important Riemannian case, when

(2.3) $f[x, \dot x] = (g_{ij} \dot x_i \dot x_j)^{1/2},$

the middle member of (2.1) can be replaced by a very simple quadratic form
in $\lambda^1, \cdots, \lambda^n$, namely,

(2.4) $Q[x, \lambda] = X_{i,j} \lambda^i \lambda^j = (\tfrac12 (\partial g_{ij}/\partial x_k) X^k + g_{ik} X^k_{x_j}) \lambda^i \lambda^j.$

Here the λ’s are components of a unit vector, inasmuch as $f[x, \lambda] = 1$ can
now be written in the form $g_{ij} \lambda^i \lambda^j = 1$. In formula (2.4), $X_{i,j}$ is the
covariant derivative of the covariant vector $X_i = g_{ik} X^k$.
To prove the theorem, we let


(2.5) $x_i = x_i(t, \tau)$



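be the coordinates of the point on $g_t$ which divides $g_t$ into two arcs of shorter
length in the ratio τ : (1 − τ) from [x(t, 0)] to [x(t, 1)]. The functions
$x_i(t, \tau)$ may be shown to be of class C″. It is here that the hypothesis
concerning the non-existence of certain pairs of conjugate points in R is used.
By definition of D(t) we have


(2.6) $D(t) = \int_0^1 f[x(t, \tau), x_\tau(t, \tau)] \, d\tau,$


which we differentiate under the integral sign, thus obtaining


$D'(t) = \int_0^1 ( f_{x_i}[x, x_\tau] \, \partial x_i/\partial t + f_{\dot x_i}[x, x_\tau] \, \partial^2 x_i/\partial \tau \, \partial t ) \, d\tau.$


The second term in the integrand can be integrated by parts, and, in a
manner familiar to all devotees of the calculus of variations, we find that


$D'(t) = \int_0^1 ( f_{x_i}[x, x_\tau] - (\partial/\partial \tau) f_{\dot x_i}[x, x_\tau] ) (\partial x_i/\partial t) \, d\tau$
$\quad + \{ f_{\dot x_i}[x(t, 1), x_\tau(t, 1)] \, \partial x_i(t, 1)/\partial t - f_{\dot x_i}[x(t, 0), x_\tau(t, 0)] \, \partial x_i(t, 0)/\partial t \}.$


In virtue of (1.2), which the $x_i(t, \tau)$ considered as functions of τ are
known to satisfy, the integral in the above representation of D′(t) disappears
at once. Remembering that the curves $C_0$ and $C_1$ are integral curves of (1.1),
the rest of the expression can also be simplified, with the result that


(2.7) $D'(t) = \{ f_{\dot x_i}[x(t, 1), x_\tau(t, 1)] X^i[x(t, 1), t] - f_{\dot x_i}[x(t, 0), x_\tau(t, 0)] X^i[x(t, 0), t] \}$
$\quad = J(t, 1) - J(t, 0),$


where we define J as follows:
$J(t, \tau) = f_{\dot x_i}[x(t, \tau), x_\tau(t, \tau)] X^i[x(t, \tau), t].$
Differentiating this last equality and again using (1.2), we obtain


(2.8) $J_\tau(t, \tau) = f_{x_i}[x(t, \tau), x_\tau(t, \tau)] X^i[x(t, \tau), t]$
$\quad + f_{\dot x_i}[x(t, \tau), x_\tau(t, \tau)] X^i_{x_j}[x(t, \tau), t] \, \partial x_j/\partial \tau.$


We now introduce the new variable s defined by


(2.9) $s = s(t, \tau) = \int_0^\tau f[x(t, \tau'), x_\tau(t, \tau')] \, d\tau',$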




which has the obvious properties:


(2.10) $\partial s/\partial \tau = f[x(t, \tau), x_\tau(t, \tau)],$
(2.11) $s(t, 1) = D(t),$
(2.12) $f[x, \partial x/\partial s] = 1.$


The last of these relations follows from the fact that, on account of the
assumed homogeneity of $f[x, \dot x]$, the parameter in (2.9) can be changed from
τ′ to s′, where s′ = s(t, τ′), with the result, deduced from (2.9), that


$s = \int_0^s f[x, \partial x/\partial s'] \, ds'.$


Again using the homogeneity property of $f[x, \dot x]$ and the relations
$\partial x_i/\partial \tau = (\partial x_i/\partial s)(\partial s/\partial \tau)$, we obtain from (2.8) the result that


$J_\tau(t, \tau) = \{ f_{x_i}[x, \partial x/\partial s] X^i[x, t] + f_{\dot x_i}[x, \partial x/\partial s] (\partial X^i/\partial x_j)(\partial x_j/\partial s) \} \, \partial s/\partial \tau.$
From this we find with the aid of (2.1) and (2.12) that

$\alpha(t) \, \partial s/\partial \tau \le J_\tau(t, \tau) \le \beta(t) \, \partial s/\partial \tau.$

Integrating from τ = 0 to τ = 1 and referring to (2.6) and (2.7), we find
that
(2.13) $\alpha(t) D(t) \le D'(t) \le \beta(t) D(t).$


The proof of the theorem is completed by integration of these last two
inequalities.

The theorem just proved yields at once a result more reminiscent of the
classical inequalities, if we take for α(t) and β(t) constants α and β which
are respectively lower and upper bounds in R for the middle member of (2.1)
or (in the case of a Riemannian metric) for the quadratic form Q given by
(2.4). We thus obtain the result


(2.14) $D(t_0) e^{\alpha (t - t_0)} \le D(t) \le D(t_0) e^{\beta (t - t_0)},$
which may be compared with such a classical formula² as


(2.15) $D(t) \le D(t_0) e^{M (t - t_0)}.$


² Cf., for instance, formula (2), p. 43, of Bieberbach’s textbook, Differentialgleichungen,
where a different notation is used in connection with a first order system and
the usual Euclidean metric.



Here M is the positive constant appearing in the Lipschitz condition (or some
positive constant multiple thereof in the more complicated cases). In other
words, if the X's are of class C' and if we neglect a possible positive constant
factor, M is an upper bound for the $|\partial X^i/\partial x_j|$. As we shall show in §6,
there are cases when Q is negative definite and hence $\beta$ may sometimes be
taken to be zero or even negative, whereas (except for the trivial case when
the X's are independent of $x_1,\dots,x_n$), M must be positive. In this sense,
at least, (2.14) is an essential refinement of (2.15).
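
The comparison of (2.14) with (2.15) can be illustrated numerically. The following is only a sketch under stated assumptions: a constant-coefficient linear system z' = Mz in the plane with the Euclidean metric, for which the distance between two solutions is the Euclidean norm of their difference z, and for which $\alpha$ and $\beta$ may be taken as the extreme eigenvalues of the symmetric part of M. The function names and the sample matrix are hypothetical and chosen only for illustration.

```python
import math

def rk4_linear(M, z, h, steps):
    """Integrate z' = M z with the classical Runge-Kutta method.

    Yields (t, z) after each step; M is a 2x2 nested list, z a 2-tuple.
    """
    def f(v):
        return (M[0][0] * v[0] + M[0][1] * v[1],
                M[1][0] * v[0] + M[1][1] * v[1])
    t = 0.0
    for _ in range(steps):
        k1 = f(z)
        k2 = f((z[0] + 0.5 * h * k1[0], z[1] + 0.5 * h * k1[1]))
        k3 = f((z[0] + 0.5 * h * k2[0], z[1] + 0.5 * h * k2[1]))
        k4 = f((z[0] + h * k3[0], z[1] + h * k3[1]))
        z = (z[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             z[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
        t += h
        yield t, z

def symmetric_part_extremes(M):
    """Smallest and largest eigenvalue of (M + M^T)/2 for a 2x2 matrix."""
    (A, B), (C, D) = M
    mean = 0.5 * (A + D)
    rad = 0.5 * math.hypot(A - D, B + C)
    return mean - rad, mean + rad

def bounds_hold(M, z0, h=1e-3, steps=2000, tol=1e-9):
    """Check D(0) e^{alpha t} <= D(t) <= D(0) e^{beta t} along the solution."""
    alpha, beta = symmetric_part_extremes(M)
    d0 = math.hypot(*z0)
    for t, z in rk4_linear(M, z0, h, steps):
        d = math.hypot(*z)
        if d < d0 * math.exp(alpha * t) - tol:
            return False
        if d > d0 * math.exp(beta * t) + tol:
            return False
    return True
```

For the sample matrix M = [[-1, 3], [-1, -2]] both extremes are negative, so the upper bound of type (2.14) forces the distance to decay, whereas any bound of type (2.15) with a positive Lipschitz constant only permits growth.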


THEOREM 2. Let $C_0$ and $C_1$ be defined as in the preceding theorem.
Let $\Gamma_{t_0}$ be an arbitrary sufficiently regular curve, with parametric equations,
$x_i = \xi_i(\tau)$, $i = 1,\dots,n$, $0 \le \tau \le 1$, whose end points are at $[x(t_0,0)]$ and
$[x(t_0,1)]$ respectively. Let $C_\tau$ be the curve whose parametric equations,
$x_i = x_i(t,\tau)$, satisfy the system (1.1) for each fixed value of $\tau$ on the unit
interval and take on the initial values $x_i(t_0,\tau) = \xi_i(\tau)$. Let $\Gamma_t$ be the arc
whose parametric equations, for each fixed t, ($t_0 \le t \le t_0 + h$), are likewise
$x_i = x_i(t,\tau)$, $0 \le \tau \le 1$. Let L(t) denote the length of $\Gamma_t$.


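If
(2.16) $\alpha^*(t) \le f_{x_i}[x,\lambda]\,X^i[x,t] + f_{\dot x_i}[x,\lambda]\,X^i_{x_j}[x,t]\,\lambda^j \le \beta^*(t)$,

for every set of the $\lambda$'s, such that $f[x,\lambda] = 1$ and at every point [x] of $\Gamma_t$,
then

(2.17) $L(t_0)\exp\int_{t_0}^{t}\alpha^*(u)\,du \le L(t) \le L(t_0)\exp\int_{t_0}^{t}\beta^*(u)\,du$.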


To prove this, we observe that by our definition of length

(2.18) $L(t) = \int_0^1 f[x(t,\tau), x_\tau(t,\tau)]\,d\tau$.

Differentiating under the integral sign, we have (after making use of (1.1))

(2.19) $L'(t) = \int_0^1 \{ f_{x_i}[x(t,\tau), x_\tau(t,\tau)]\,X^i[x(t,\tau), t]$
       $+ f_{\dot x_i}[x(t,\tau), x_\tau(t,\tau)]\,X^i_{x_j}[x(t,\tau), t]\,\partial x_j/\partial\tau \}\,d\tau$.

As in the preceding proof, we introduce the arc length s as a new parameter,
so that

$f[x, \partial x/\partial s] = 1$.

Again using the homogeneity of $f[x,\dot x]$ and hence the same homogeneity of
the integrand in (2.19), we find that

(2.20) $L'(t) = \int_0^{L(t)} \{ f_{x_i}[x, \partial x/\partial s]\,X^i[x,t] + f_{\dot x_i}[x, \partial x/\partial s]\,X^i_{x_j}[x,t]\,\partial x_j/\partial s \}\,ds$.

It now follows at once from (2.16) that

(2.21) $\alpha^*(t)L(t) \le L'(t) \le \beta^*(t)L(t)$,

and hence the proof is completed by integration of (2.21).


The relationship between Theorem 1 and Theorem 2 is made sufficiently 
obvious by the following remark: 


Although $\Gamma_t$ is not, in general, a geodesic for all values of t (or even
for any value of t), it is possible to choose $\Gamma_{t_0}$ in such a way that for some
particular t, $\Gamma_t$ is a geodesic. For such a $\Gamma_t$ it is clear that

$D(t) = L(t)$
whereas
$D(t + \Delta t) \le L(t + \Delta t)$.
Hence
$\{D(t + \Delta t) - D(t)\}/\Delta t \le \{L(t + \Delta t) - L(t)\}/\Delta t$, if $\Delta t > 0$,
while
$\{D(t + \Delta t) - D(t)\}/\Delta t \ge \{L(t + \Delta t) - L(t)\}/\Delta t$, if $\Delta t < 0$.

Hence, allowing $\Delta t$ to approach zero first through positive values and then
through negative values, we find that $D'(t) = L'(t)$. Thus the inequalities
(2.13) used in the proof of Theorem 1 can be deduced from the corresponding
inequalities (2.21) used in the proof of Theorem 2. The indicated alternative
in the proof of Theorem 1 was not adopted because of the essential importance
of formula (2.7) in §3.


THEOREM 3. Let $C_0$, $C_1$, D(t), $\alpha(t)$ be defined as in Theorem 1. Let
the $\Gamma_{t_0}$ introduced in Theorem 2 be chosen as a geodesic, and then let L(t)
and $\beta^*(t)$ be defined as in Theorem 2. Then

$0 \le L(t) - D(t) \le D(t_0)\,[\exp\int_{t_0}^{t}\beta^*(u)\,du - \exp\int_{t_0}^{t}\alpha(u)\,du]$.

This theorem, which is a trivial consequence of $L(t_0) = D(t_0)$ and the
two previous theorems, gives some information about the manner in which
geodesics are deformed under the transformations defined by the differential
system (1.1).




3. Estimates for D''(t) and L''(t) on the hypothesis that the $X^i$ are
of class C''. So far we have not made use of the particular manner in which
the parameter $\tau$ was chosen in the proof of Theorem 1. Actually it was chosen
in such a manner that

(3.1) $D(t) = f[x(t,\tau), x_\tau(t,\tau)]$,

whence, differentiating with respect to $\tau$, we have

(3.2) $0 = f_{x_i}\,\partial x_i/\partial\tau + f_{\dot x_i}\,\partial^2 x_i/\partial\tau^2$.

On account of the assumptions made in §1 on $f[x,\dot x]$, it may be shown that,
although the n equations (1.2) are not independent, these equations taken
together with (3.2) can be solved uniquely* for $\partial^2 x_i/\partial\tau^2$ in terms of
$x_1,\dots,x_n$ and $\partial x_1/\partial\tau,\dots,\partial x_n/\partial\tau$. We accordingly write

(3.3) $\partial^2 x_i/\partial\tau^2 = F^i[x, \partial x/\partial\tau]$.


It is easy to verify that the functions $F^i[x,\dot x]$, here introduced, must be
homogeneous of degree 2 in $\dot x_1,\dots,\dot x_n$. It is perhaps also worthwhile to
record the fact that these F's are determined by the following system of n
linear equations:


(3.4) + fedt,)F! — — + yn. 
We introduce the abbreviations, 
(3. 5) Plt, x, = — — 4(X*X*) 
+ + fot, ate + 
+ fi a, + X te 
+ fe + X41}, 


* These are essentially well known facts in the Calculus of Variations. It is also 
easily shown that a set of m independent equations equivalent to the nm + 1 equations 
(1.2) and (3.2) may be written 


[a®(f) /dx,] — [08 (f) = 0, 


where ® is an arbitrary function of class C” whose first and second derivatives are positive 
for positive values of its argument. Formula (3.4) was obtained by taking @(f) = f?. 
If we set 94; @) +f Gi = — 3Fi, etc., the relation (3.4) is seen to 
have obvious connections with certain equations given by E. Cartan, “Les espaces de 
Finsler,” Actualités scientifiques et industrielles, 79, Exposés de Géométrie, II, pp. 16 
and 17. 




and 
+ X*) 2, + + fa + X4}, 


so that the following theorems may be stated in reasonably compact form. 


THEOREM 4. With the same assumptions as those of Theorem 1
regarding $C_0$, $C_1$, D(t), $g_t$, we assume in addition that

(3.7) $\phi(t) \le P[t,x,\lambda] \le \psi(t)$

at every point [x] on $g_t$ and for every set $\lambda^1,\dots,\lambda^n$ such that $f[x,\lambda] = 1$.
Then

(3.8) $\phi(t)D(t) \le D''(t) \le \psi(t)D(t)$.

THEOREM 5. With the same assumptions as those of Theorem 2
regarding $C_0$, $C_1$, $\Gamma_t$, L(t), we assume in addition that

(3.9) $\phi^*(t) \le P^*[t,x,\lambda] \le \psi^*(t)$

at every point [x] on $\Gamma_t$ and for every set $\lambda^1,\dots,\lambda^n$ such that $f[x,\lambda] = 1$.
Then

(3.10) $\phi^*(t)L(t) \le L''(t) \le \psi^*(t)L(t)$.
In the Riemannian case we may write,

(3.11) $B[\lambda] = [f[x,\lambda]]^3\,P[t,x,\lambda] = Y_{ijkl}(x,t)\,\lambda^i\lambda^j\lambda^k\lambda^l$,
       $B^*[\lambda] = [f[x,\lambda]]^3\,P^*[t,x,\lambda] = Y^*_{ijkl}(x,t)\,\lambda^i\lambda^j\lambda^k\lambda^l$.

These biquadratic forms, in their relation to the appraisal of the second
derivatives of D(t) and L(t) are analogous to the quadratic form $Q[\lambda]$
given by (2.4) in its relation to the appraisal of the first derivatives. There
seems, however, to be no particularly simple interpretation for the coefficients
$Y_{ijkl}$ or $Y^*_{ijkl}$ such as we had for the $X_{i,j}$ in (2.4) as the covariant
derivative of $X_i$.

To prove Theorem 4, differentiate (2.7) with respect to t and remember
that $x_i = x_i(t,\tau)$, $i = 1,\dots,n$, for $\tau = 0$ and 1 satisfy (1.1). We thus
find that


(3. 12) D’ (t) = K(t,1) —K(t, 0) — t) dr, 




where 
K (t, T) fz.2,[z(t, tT), x,(t, tT) ]Xi,,[a(t, tT), t] X'[a(t, tT), t ]02%./0r 


Now differentiating this last equality with respect to 7 and using both (1. 2) 
and (3.3) as well as the easily proved relations, 


we obtain 


$K_\tau(t,\tau) = P[t, x(t,\tau), x_\tau(t,\tau)]$.

Now, since $P[t,x,\dot x]$ is homogeneous of degree +1, we may introduce the
parameter s as in the proof of Theorem 1, thus obtaining

$K_\tau(t,\tau) = P[t, x(t,s), x_s(t,s)]\,f[x(t,\tau), x_\tau(t,\tau)]$, $f[x(t,s), x_s(t,s)] = 1$.

From this, we find with the aid of (3.7) and (2.12), that

$\phi(t)\,f[x, x_\tau] \le K_\tau(t,\tau) \le \psi(t)\,f[x, x_\tau]$.

Integrating from $\tau = 0$ to $\tau = 1$ and referring to (2.6) and (3.12), we find
that (3.8) has been proved, as desired.


To prove Theorem 5, we differentiate (2.19) under the integral sign
and make use of (1.1). We thus obtain

$L''(t) = \int_0^1 P^*[t, x(t,\tau), x_\tau(t,\tau)]\,d\tau$.

Since $P^*[t,x,\dot x]$ is homogeneous of degree 1 in the $\dot x$'s, we have, upon
introducing the arc length as a new parameter, the following formula:

$L''(t) = \int_0^{L(t)} P^*[t, x(t,s), x_s(t,s)]\,ds$.

It now follows immediately from (3.9) that (3.10) is true, thus completing
the proof.
It is interesting to notice the difference between the two forms P and P*: 


+ fast, {X ta, XI — 




Upon rearranging the terms and writing 
(3. 13) Si = — X* — + 
we thus obtain 


(3. 14) P— 


4. Special case of a Riemann metric with constant coefficients. For
most purposes it is sufficient to take a Riemann metric with constant
coefficients. If this is done, practically all formulas are enormously
simplified. In this section we therefore assume that $f^2 = g_{ij}\dot x_i\dot x_j$, where the g's
are constants. From (2.4), we have

(4.1) $Q[\lambda] = g_{ij}\,X^i_{x_k}\,\lambda^j\lambda^k$.

Since f is independent of the undotted x's, we have $f_{x_i} = f_{x_i x_j} = f_{x_i \dot x_j} = 0$,
so that (3.4) yields $F^i = 0$. From (3.5), we now obtain


+ JapIJin ) 
+ JapGinX 

From (3.13), we have 


(4. 3) = (GapGii — 9aiG Bi) 


Still further specializing to the Euclidean case we obtain 


(4. 1E) Q[A] = 


+ + 
(4. 3E) = 2,2, — Xb j 

5. Second order linear systems. The simplest nontrivial system to
which we may apply the above theory is the second order linear system,

(5.1) $dx/dt = A(t)x + B(t)y$, $dy/dt = C(t)x + D(t)y$,

which we shall consider in connection with the metric,

(5.2) $ds^2 = E\,dx^2 + 2F\,dx\,dy + G\,dy^2$, $EG - F^2 > 0$, $E > 0$.

Here E, F, G are constants. Thus, taking $g_{11} = E$, $g_{12} = F$, $g_{22} = G$, $\lambda^1 = \lambda$,
$\lambda^2 = \mu$, $x_1 = x$, $x_2 = y$, the fundamental quadratic form Q may be computed
from (4.1):

(5.3) $Q[\lambda] = (AE + CF)\lambda^2 + (BE + DF + AF + CG)\lambda\mu + (BF + DG)\mu^2$.

The extreme values of Q under the condition, $E\lambda^2 + 2F\lambda\mu + G\mu^2 = 1$, that
$\lambda$ and $\mu$ be the components of a unit vector, can be found by the method of
Lagrange's multipliers. The maximum, $\beta(t)$, and the minimum, $\alpha(t)$, of
$Q[\lambda]$ are thus seen to be the roots of the following quadratic equation in s:

(5.4) $s^2 - (A + D)s + AD - BC - \tfrac14 (BE - AF + DF - CG)^2/(EG - F^2) = 0$.
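
The statement that the extreme values of Q on unit vectors are the roots of (5.4) lends itself to a direct numerical check. The sketch below is an illustration only: it samples the form (5.3), divided by the squared length $E\lambda^2 + 2F\lambda\mu + G\mu^2$, over directions in the plane and compares the sampled extremes with the roots of (5.4). The function names and the sample coefficients are hypothetical.

```python
import math

def q_extremes_by_sampling(A, B, C, D, E, F, G, n=20000):
    """Sample Q of (5.3) over unit vectors of the metric (5.2); return (min, max)."""
    lo, hi = math.inf, -math.inf
    for k in range(n):
        th = math.pi * k / n  # Q is even in (lam, mu), so half a turn suffices
        lam, mu = math.cos(th), math.sin(th)
        norm2 = E * lam * lam + 2 * F * lam * mu + G * mu * mu
        q = ((A * E + C * F) * lam * lam
             + (B * E + D * F + A * F + C * G) * lam * mu
             + (B * F + D * G) * mu * mu) / norm2  # rescale to a unit vector
        lo, hi = min(lo, q), max(hi, q)
    return lo, hi

def roots_of_5_4(A, B, C, D, E, F, G):
    """Roots (alpha, beta) of the quadratic (5.4)."""
    H = E * G - F * F
    const = A * D - B * C - 0.25 * (B * E - A * F + D * F - C * G) ** 2 / H
    disc = max((A + D) ** 2 - 4 * const, 0.0)  # guard against rounding below zero
    r = math.sqrt(disc)
    return ((A + D) - r) / 2, ((A + D) + r) / 2
```

With, say, A, B, C, D = 1, 2, -3, 0.5 and E, F, G = 2, 0.5, 1, the sampled minimum and maximum agree with the two roots to within the sampling resolution, and their sum is A + D, independently of the metric.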


It is interesting to notice that the sum of the roots of this quadratic is the
same as the sum of the roots of the equation,

(5.5) $s^2 - (A + D)s + AD - BC = 0$,

which, in the case of constant coefficients, A, B, C, D, is commonly called
the characteristic equation. We shall call it the “characteristic equation,”
even when A, B, C, D are not constants. A most favorable metric (for some
fixed value of t in the non-constant case) is defined as a metric which makes
$\beta - \alpha$ a minimum. Since $\alpha + \beta = A + D$ is independent of the metric,
it is clear that a most favorable choice also makes $\beta$ a minimum and $\alpha$ a
maximum. We summarize our results on most favorable metrics in the
following three theorems:

THEOREM 6. If
(5.6) $(A + D)^2 - 4(AD - BC) = (A - D)^2 + 4BC > 0$,

a most favorable metric gives to $\alpha$ and $\beta$ the values of the roots of the
characteristic equation (5.5). There exist infinitely many essentially distinct⁴
most favorable metrics.

Proof. Since $(\beta - \alpha)^2$ is to be a minimum, we must choose E, F, and G
so as to minimize the discriminant of (5.4). But the discriminant of (5.4)


4 Two metrics, $E\,dx^2 + 2F\,dx\,dy + G\,dy^2$ and $E'\,dx^2 + 2F'\,dx\,dy + G'\,dy^2$, are for the
purposes of Theorems 6 and 7 regarded as essentially the same, if, and only if,
$E/E' = F/F' = G/G'$.




is the same as the discriminant of (5.5) augmented by the non-negative
quantity,

$(BE - AF + DF - CG)^2/(EG - F^2)$.

Hence the required minimization is obtained by choosing E, F, and G, so that

(5.7) $BE - AF + DF - CG = 0$, $EG - F^2 > 0$, $E > 0$.

If BC > 0, we satisfy (5.7) by taking $E = |C|$, $F = 0$, $G = |B|$.
If BC < 0, we take $E = \pm(A - D)C$, $F = \pm 2BC$, $G = \pm(D - A)B$,
where $\pm$ is chosen so that $\pm(A - D)C > 0$. This makes $EG - F^2
= -BC[(A - D)^2 + 4BC]$, which in view of (5.6) must be positive in present
circumstances. If $B = 0$ and $C \ne 0$, take $E = 1$, $F = \pm C$, $G = \pm(D - A) > 0$.
Similarly, if $B \ne 0$ and $C = 0$. Finally, if $B = C = 0$, take $E = G = 1$
and $F = 0$. From considerations of continuity, it is clear that in each of
these cases there are infinitely many other choices.


THEOREM 7. If
(5.8) $(A + D)^2 - 4(AD - BC) = (A - D)^2 + 4BC < 0$,

there exists an essentially unique most favorable metric, making $\alpha = \beta =$
real part of one of the conjugate complex roots of the characteristic equation.

Proof. In view of the known relationship between the roots of (5.5)
and (5.4), it is sufficient to show that E, F, and G can be chosen in
essentially just one way in such wise that (5.4) have a double root. Equating
to zero the discriminant of (5.4) and making some algebraic manipulations,
we find that E, F, and G must satisfy the relation,

(5.9) $(BE + CG)^2 - (2BF + DG - AG)(2CF + AE - DE) = 0$,

if $\alpha$ and $\beta$ are to be equal. While there are obviously an infinite number
of ways of satisfying this relation, it turns out that there is (neglecting a
common factor) only one real set of values for E, F, G, namely

(5.10) $E = \pm C > 0$, $F = \pm\tfrac12 (D - A)$, $G = \mp B$,

which satisfies (5.9).

In fact, if we set $p = 2BF + DG - AG$ and $q = 2CF + AE - DE$,
we find at once that

(5.11) $(BE + CG)(D - A) = Cp - Bq$.




It follows from (5.9) that p and q must satisfy the relation

(5.12) $(Cp - Bq)^2 - (D - A)^2 pq = 0$.

The discriminant of this quadratic form in p and q is found to be
$(A - D)^2[4BC + (A - D)^2]$, which, because of (5.8), is surely negative
in case $A \ne D$. Since the left hand member of (5.12) is therefore positive
definite, the only real values for p and q which can satisfy (5.12) are
$p = q = 0$. The stated result follows at once, in case $A \ne D$. In the
contrary case, $A = D$, we are justified, in view of (5.11) and the fact deduced
from (5.8) that $BC \ne 0$, in writing $p = kB$ and $q = kC$. Thus (5.9)
becomes $(BE + CG)^2 = k^2 BC$. The left hand side of this relation is
non-negative, while, in view of (5.8), the right hand side is non-positive. Hence
$k = 0$, and we are led to $p = q = 0$, as before, as well as to $BE + CG = 0$,
from which the proof is readily completed.
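
That the metric (5.10) does produce a double root of (5.4) can be confirmed by substitution. The sketch below is an illustration only: it builds E, F, G from (5.10), choosing the sign so that E > 0, and evaluates the discriminant of (5.4), namely the discriminant of (5.5) augmented by $(BE - AF + DF - CG)^2/(EG - F^2)$. The function names and the sample coefficients are hypothetical.

```python
def most_favorable_metric(A, B, C, D):
    """Return (E, F, G) from (5.10), assuming (A - D)^2 + 4BC < 0."""
    delta = (A - D) ** 2 + 4 * B * C
    if delta >= 0:
        raise ValueError("requires (A - D)^2 + 4BC < 0")
    sign = 1.0 if C > 0 else -1.0  # delta < 0 forces BC < 0, hence C != 0
    return sign * C, sign * 0.5 * (D - A), -sign * B

def discriminant_5_4(A, B, C, D, E, F, G):
    """Discriminant of the quadratic (5.4) in s."""
    return ((A - D) ** 2 + 4 * B * C
            + (B * E - A * F + D * F - C * G) ** 2 / (E * G - F * F))
```

For A, B, C, D = 1, 2, -3, 0 this gives E, F, G = 3, 0.5, 2, a positive definite metric for which the discriminant vanishes, so that $\alpha = \beta = \tfrac12 (A + D)$.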


THEOREM 8. If
(5.13) $(A + D)^2 - 4(AD - BC) = (A - D)^2 + 4BC = 0$,
there is no most favorable metric, except when $B = C = 0$. But, for any
number $\epsilon > 0$, a metric can always be found such that
$\tfrac12 (A + D) - \epsilon < \alpha \le \tfrac12 (A + D) \le \beta < \tfrac12 (A + D) + \epsilon$.

Proof. We prove the second statement first. If $A \ne D$, then $BC < 0$.
Then, if we take $E = \pm(A - D)C > 0$, $F = \pm 2BC\theta$, $G = \pm(D - A)B$,
$0 < \theta < 1$, we find that

$EG - F^2 = -BC(A - D)^2 - 4B^2C^2\theta^2$
$= -BC[(A - D)^2 + 4BC] - BC[4BC(\theta^2 - 1)]$,

which, on account of (5.13) reduces to $4B^2C^2(1 - \theta^2) > 0$, so that $EG - F^2 > 0$,
as required. We also find that equation (5.4) becomes

$s^2 - (A + D)s + \tfrac14 (A + D)^2 - \tfrac14 [(A - D)^2(1 - \theta)]/(1 + \theta) = 0$.

Since $0 < \theta < 1$, the roots of this equation are real; and, by taking $\theta$
sufficiently close to 1, we can make the roots differ from $\tfrac12 (A + D)$ by less
than $\epsilon$. If, however, $A = D$, either B or C is zero or both are zero. We now
take $E = |C| + \eta$, $F = 0$, $G = |B| + \eta$, $\eta > 0$. It follows that $EG - F^2 > 0$,
while (5.4) becomes

$s^2 - 2As + A^2 = \tfrac14 [(B - C)^2\eta]/(|B| + |C| + \eta)$,

which yields real roots differing from A by less than $\epsilon$, if $\eta$ is sufficiently
small.

To prove the first statement, we note that the established second statement
implies that for any most favorable metric, $\alpha = \beta = \tfrac12 (A + D)$, a
double root of (5.5). Thus, equations (5.4) and (5.5) must be identical
equations, and thus we are led to the relation, $BE - AF + DF - CG = 0$,
to be satisfied by the coefficients of any most favorable metric. This relation
can also be written in the form

(5.14) $(A - D)^2F^2 = (BE - CG)^2$.

From (5.13), we have (on multiplication by EG)

(5.15) $-(A - D)^2 EG = 4BCEG$.

Adding corresponding members of (5.14) and (5.15), we obtain

$-(A - D)^2(EG - F^2) = (BE + CG)^2 \ge 0$.

Since $EG - F^2$ is to be positive, this relation is impossible unless $A = D$
and $BE = -CG$. From these two relations taken with (5.13), we find
at once that we would have to have $B = C = 0$ as stated.

We shall not dwell on the obvious connections between Theorems 6, 7,
and 8 and the explicit solutions of (5.1) which are available when A, B, C,
and D are constants. We merely remark that such considerations show
that Theorems 1 and 2 are the best possible theorems of their type.

We now consider the biquadratic forms $B[\lambda]$ and $B^*[\lambda] = B - f^2 X^i S_i$,
whose significance is explained in §3. We see at once from (4.3) that, for
any linear system (not merely second order system) and for any Riemann
metric with constant coefficients, all the $S_i$ must vanish. Hence, our first
result is to the effect that $B[\lambda] = B^*[\lambda]$.

In order to write down the expression for B[z] in reasonably compact 
and yet illuminating form, we introduce A*, B*, C*, D*, and H, defined as 


follows: 
A*B* EF ABN? , (AtB: 
H = EG — F?. 


We then find from (4.2) that 


(5.16) H[(—C# + (A — D) ay + By??? 
+ + 2Fiy + Gy?] + (B* + C*) ay + D*y’]. 




Hence, lower and upper bounds for B, for sets (#,y) satisfying f? = Ez? 

+ Fay + Gy? =1, can be written in the form Ho* + p or p, where p being 

an extreme value of the quadratic form, A*z? + (B* + C*)zy + D*y?, 

satisfies the quadratic equation, 

Hp? — [A*G + D*E — B*F — C*F |p + (A*D* — B*C*) — 4 (B* — C*)? 
= 0, 


while o, being an extreme value of the form, — Cz? + (A —D)ay + By’, 
satisfies the quadratic 


Ho? — [BE — AF + DF —CG]o + (AD— BC) —}(A+D)?=0. 


In view of Theorem 7 and (5.10), the occurrence of the form, 
+ Ci? = (A—D)azy = By’, in (5.16) is rather interesting. It seems 
worth while therefore to investigate what happens when, for a certain value 
of t, (5.10) may be assumed to hold. The result depends upon the following 
matric identity, which the reader will have no trouble in verifying: 


symm [ (5 4) >) 


Here the symbol symm M is used for the symmetric part of the matrix 
i.e. symm M=4(M-+M’). Using the definition of A*, B*, C*, D*, and 
H. we find from (5.16) that, when 


Ex? + 2Fiy + Gy = + C# + (D— A) ty = By? = 1, 
we necessarily have 
B=}(A+D)? + ([0Ar + 3(D—A)Q] 
+ — BC: + — A) (De + Ar) 
+ [4(D—A)B; — BD; = z, x]. 


This result is fully to be expected in case the coefficients A, B, C and D
of (5.1) are constants. Thus, if (5.10) holds for a particular value of t,
it holds for all t. Our result shows that the estimates given in Theorems 4
and 5 are the best of their types. For we have exhibited an example in
which the estimates give the exact values.




6. Application to a simple non-linear system. Consider the second
order system,

$dx/dt = -k(x^2 + y^2)^{1/2}\,x$,
$dy/dt = -g - k(x^2 + y^2)^{1/2}\,y$,

where k and g are positive constants, with the Euclidean metric, $ds^2 = dx^2 + dy^2$.
Thus taking $g_{11} = g_{22} = 1$, $g_{12} = 0$, $x_1 = x$, $x_2 = y$, $\lambda^1 = \lambda$, $\lambda^2 = \mu$,
the fundamental quadratic form Q may be computed from (4.1E). We thus
find

$-(1/k)\,Q[\lambda] = (x^2 + y^2)^{-1/2}(2x^2 + y^2)\lambda^2 + 2(x^2 + y^2)^{-1/2}xy\,\lambda\mu$
$+ (x^2 + y^2)^{-1/2}(x^2 + 2y^2)\mu^2$.

If we set $v = (x^2 + y^2)^{1/2}$, $\cos\theta = x/v$, $\sin\theta = y/v$, we obtain

$-(1/kv)\,Q[\lambda] = (1 + \cos^2\theta)\lambda^2 + (2\cos\theta\sin\theta)\lambda\mu + (1 + \sin^2\theta)\mu^2$
$= \lambda^2 + \mu^2 + (\lambda\cos\theta + \mu\sin\theta)^2$
$= 1 + (\lambda\cos\theta + \mu\sin\theta)^2$ for unit vectors $(\lambda, \mu)$,

for which $\lambda^2 + \mu^2 = 1$. Evidently, for such unit vectors, Schwarz's inequality
yields
$0 \le (\lambda\cos\theta + \mu\sin\theta)^2 \le 1$.
Therefore
$1 \le -(1/kv)\,Q[\lambda] \le 2$,

$-2kv \le Q[\lambda] \le -kv$.

We thus have an example of a system in which Q is negative definite at all
points other than the origin. The biquadratic form B, however, seems too
complicated to bother with.
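
The bounds $-2kv \le Q[\lambda] \le -kv$ can be spot-checked numerically. The sketch below is an illustration only: it forms Q from the partial derivatives of the right-hand members of the system (the constant g drops out on differentiation) and tests the two inequalities at randomly chosen points and unit vectors. The function names, the sampling box and the value of k are hypothetical.

```python
import math
import random

def q_form(x, y, lam, mu, k):
    """Q[lambda] for dx/dt = -k v x, dy/dt = -g - k v y, with v = (x^2 + y^2)^(1/2)."""
    v = math.hypot(x, y)
    a11 = -k * (v + x * x / v)  # d/dx of -k v x
    a12 = -k * x * y / v        # d/dy of -k v x, equal to d/dx of -k v y
    a22 = -k * (v + y * y / v)  # d/dy of -k v y
    return a11 * lam * lam + 2 * a12 * lam * mu + a22 * mu * mu

def check_bounds(k=0.3, trials=10000, seed=0, tol=1e-9):
    """Test -2kv <= Q <= -kv at random points and random Euclidean unit vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        v = math.hypot(x, y)
        if v < 1e-9:
            continue  # Q is not defined by this formula at the origin
        th = rng.uniform(0, 2 * math.pi)
        q = q_form(x, y, math.cos(th), math.sin(th), k)
        if not (-2 * k * v - tol <= q <= -k * v + tol):
            return False
    return True
```

At the point (3, 4) with k = 1 and the unit vector (1, 0), for instance, v = 5 and Q = -6.8, which lies between -10 and -5 as the inequalities require.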


7. Applications to the qualitative integration of differential equations. 
We close our paper with a few indications of the way in which our ideas 
may prove to be of value in the study of functions defined by systems of 
ordinary differential equations. 


THEOREM 9. Suppose that the region R of Theorem 1 is such that the
$X^i[x,t]$ are defined for [x] in R and for $t_0 \le t < +\infty$. Suppose also that
no solution curve which for $t = t_0$ is within R ever passes through a boundary
point of R for $t > t_0$. Let A be the least upper bound of

(7.1) $f_{x_i}[x,\lambda]\,X^i[x,t] + f_{\dot x_i}[x,\lambda]\,X^i_{x_j}[x,t]\,\lambda^j$

for [x] in R for every set of $\lambda$'s such that $f[x,\lambda] = 1$ and for $t \ge t_0$.
Then, if $A < 0$, any two solutions, which for $t \ge t_1 \ge t_0$ are within R,
must approach each other asymptotically. If R is finite, this phenomenon is
uniform.

Proof. In fact, from either Theorem 1 or Theorem 2, it is obvious that
the distance between corresponding points on the two trajectories is less than
or equal to the distance at $t = t_1$, multiplied by $\exp[A(t - t_1)]$, ($A < 0$).
Hence it tends to zero as $t \to \infty$. The uniformity results from the fact that
the distance at $t = t_1$ for any two trajectories is bounded, if R is finite.
Further details are omitted.


THEOREM 10. Suppose that, in addition to the hypothesis of Theorem 9,
it is assumed that the X's depend periodically on t with period $\tau$ and that R
is finite. Then there exists in R a unique periodic solution of period $\tau$,
toward which all other solutions entering R must approach asymptotically.

Proof. Let $x_i = x_i^0(t)$, $i = 1,\dots,n$, be an arbitrary solution which
for $t \ge t_1$ is within R. Then $x_i = x_i^m(t) = x_i^0(t + m\tau)$, $i = 1,\dots,n$,
for each positive integral value of m, is also a solution. By the previous
theorem, it is known that any two of these solutions approach each other
uniformly. Hence, given a number $\epsilon > 0$ one can always find a number T,
independent of p, such that

$|x_i^p(t) - x_i^0(t)| < \epsilon$ for $t > T$.

In particular $|x_i^p(m\tau) - x_i^0(m\tau)| < \epsilon$ for integral $m > T/\tau$. From the
definition of $x_i^m(t)$ this inequality can be written

$|x_i^{m+p}(0) - x_i^m(0)| < \epsilon$ for $m > T/\tau$.

Hence the points $[x^m(0)]$ form a Cauchy sequence and we can write

(7.2) $\lim_{m\to\infty} x_i^m(0) = a_i$, $i = 1,\dots,n$.


The solution, $x_i = x_i(t)$, such that $x_i(0) = a_i$ then turns out to be periodic.
To establish this it is sufficient (on account of the uniqueness theorem for
solutions of (1.1) taken together with the periodicity of the X's) to show
that $x_i(\tau) = a_i$. By the known continuity of solutions in respect to their
dependence on the initial conditions, we know that (7.2) implies

$x_i(t) = \lim_{m\to\infty} x_i^m(t)$, $i = 1, 2,\dots,n$,

uniformly over the finite interval $0 \le t \le \tau$. In particular,

$x_i(\tau) = \lim x_i^m(\tau) = \lim x_i^{m+1}(0) = \lim x_i^m(0) = a_i$,

as we wished to prove. Finally, the periodic solution [x(t)] must be unique.
For, if there were two periodic solutions [x(t)] and [$\bar x$(t)] it is clear, from
the fact that $\sum_i |x_i(t) - \bar x_i(t)|^2$ is continuous on $0 \le t \le \tau$, that this
quantity has (on $0 \le t \le \tau$) a minimum value, which also serves (on account
of periodicity) as the minimum value for all t. From Theorem 9, this
minimum value must be 0. By the uniqueness theorem for differential
equations, $x_i(t)$ and $\bar x_i(t)$ would then have to coincide, and the theorem is
established.

The reader will notice that in the special case in which the X's do not
depend explicitly on t, the periodic solution, mentioned in Theorem 10, must
be an equilibrium point (i.e. a point where all the X's vanish), toward
which all other solutions tend asymptotically. In fact, we can prove the
following theorem for the non-existence of periodic solutions, which is
somewhat reminiscent of a result of Bendixson in the case n = 2.⁵


THEOREM 11. If the X's do not depend explicitly on t it is impossible
to have a periodic solution (other than an equilibrium point) in any point
set in which the expression (7.1) is either positive definite or negative
definite.

Proof. We consider only the case when (7.1) is negative definite. The
other case reduces to this, if t is replaced by $-t$. Suppose we had a given
periodic solution [x(t)] with period $\tau$ on which (7.1) were negative definite.
Then, referring to Theorem 2, we take $C_0$ as the curve $x_i = x_i(t)$, $C_1$ as the
curve $x_i = x_i(t + \tau)$, and $\Gamma_{t_0}$ as the curve with one end point at $[x(t_0)]$
drawn around to the point $[x(t_0 + \tau)]$. Since $x_i(t_0 + \tau) = x_i(t_0)$, we
see that $\Gamma_{t_0}$, and hence $\Gamma_t$, is closed. The length L(t) is nothing other than
the length of the given closed trajectory, which of course is a constant. Since
(7.1) is negative definite, the $\beta^*$ of (2.17) may be taken less than a fixed
negative number. Here we make use of the fact that the closed trajectory is
compact, but we omit details. Hence, from Theorem 2, $L(t) \to 0$ as $t \to \infty$.
Hence $L(t) = 0$. Thus the given periodic solution would reduce to an
equilibrium point, and the theorem is proved.

5 Ivar Bendixson, “Sur les courbes définies par des équations différentielles,” Acta
Mathematica, vol. 24 (1901), pp. 1-88, especially p. 78.

In conclusion, we point out certain obvious connections between the
results of this paper and a paper by Wintner.⁶ We refer, for example, to
his theorem to the effect that, if A(t) is a matrix of $n^2$ real continuous
functions satisfying

(7.3) $\limsup_{t\to\infty} \int_0^t \max_{|y|=1} (yA(s)y)\,ds < \infty$,

then every solution vector of $dx/dt = A(t)x$ is bounded as $t \to \infty$.
Here the quadratic form yAy plays the same role as our Q, and, in fact,
this theorem of Wintner results immediately, if we apply our Theorem 1
or 2 to a linear system and a Euclidean metric.


UNIVERSITY OF MARYLAND, COLLEGE PARK, MD. 
NAVAL ORDNANCE LABORATORY, WHITE OAK, MD. 


6 Aurel Wintner, “Asymptotic integration constants,” American Journal of
Mathematics, vol. 68 (1946), pp. 553-559, especially p. 558.




THE ASYMPTOTICALLY PERIODIC BEHAVIOR OF THE SOLUTIONS
OF SOME LINEAR FUNCTIONAL EQUATIONS.*
By N. G. DE BRUIJN.

1. Introduction. A typical example of the equations to be dealt with
in this paper is

(1.1) $x^{-\alpha}f'(x) + f(x) - f(x-1) = 0$,

where $\alpha$ is a positive constant. Any continuous¹ function f(x), arbitrarily
given in the interval $0 \le x \le 1$, can be continued to a solution of (1.1) for
$x \ge 1$ simply by solving the differential equation $x^{-\alpha}g'(x) + g(x) = f(x-1)$
for $1 \le x \le 2$ with the condition $g(1) = f(1)$, and so on.

It is easily seen that a function satisfying (1.1) for $x \ge 1$ which is
continuous for $x \ge 0$ and bounded by M on $0 \le x \le 1$ is also bounded by M
on $0 \le x < \infty$. Namely, let U be the least upper bound of f(x) on $0 \le x \le X$
and let $x_0$ be the smallest number $\ge 0$ with $f(x_0) = U$. If $x_0$ would exceed
1, and so $f'(x_0) \ge 0$, we could infer from (1.1) that $f(x_0 - 1) \ge U$ and
hence $f(x_0 - 1) = U$. This contradicts the minimum property of $x_0$;
consequently $0 \le x_0 \le 1$ and $U = f(x_0) \le M$.

We can prove more, e.g. that $f(x) - f(x-1) \to 0$ for any solution of
(1.1). If $\alpha > 1$ we even find that f(x) tends to a periodic function $\psi(x)$
of period 1; this function has derivatives of all orders and we have²

(1.2) $f(x) - \psi(x) \to 0$ for $x \to \infty$.

If $0 < \alpha \le 1$ this is no longer true. If $\tfrac12 < \alpha \le 1$ we still have a function
$\psi(x)$ of period 1 such that

(1.3) $f^{(k)}(x) - \psi^{(k)}\{x - \cdots\} \to 0$ for $x \to \infty$, $k = 0, 1, 2, \dots$.

If $0 < \alpha \le \tfrac12$ the asymptotic behavior of the solutions is not known. If,
however, $\alpha = 0$ it is comparatively simple: any solution tends to a constant.
We shall deal with this last case, and with generalizations of it, in a separate
paper.

* Received September 15, 1948.
1 Throughout the paper all variables and functions are supposed to be real without
explicit statement.
2 If f(x) is continuous for $x \ge 0$ and satisfies (1.1) for $x \ge 1$, then f'(x) exists for
$x \ge 1$, f'' exists for $x \ge 2$, etc.; cf. Lemma 2.

314 N. G. DE BRUIJN. 


In the case of the equation (1.1) for $\alpha > \tfrac12$, or more generally the
homogeneous equations of class $\mathfrak{B}_n$ ($n \ge 2$) below, we apparently have a linear
operator $Tf = \psi$; its domain³ consists of the functions f(x) which are
continuous on $0 \le x \le 1$ and its range R consists of a set of periodic functions
of period 1. This operator is continuous: if $f_n \to f$ uniformly on $0 \le x \le 1$,
then we have uniformly $Tf_n \to Tf$. We shall prove furthermore that⁴ R is
dense in the space $R_0$ of continuous periodic functions of period 1, that is to
say, if a function $\chi \in R_0$ and a positive number $\epsilon$ are arbitrarily given, then a
solution f(x) of the functional equation can be found such that $\psi = Tf$
satisfies $|\psi(x) - \chi(x)| < \epsilon$ for all values of x.

The following uniqueness problem arises: Is the correspondence $f \to \psi$
one to one? Or, what is the same thing, does $f \to 0$ for $x \to \infty$ imply $f \equiv 0$?
The answer is affirmative for a rather special class of equations (§5), including
the equation (1.1) for $\alpha > 1$, but the result probably holds true for wider
classes. Thus far we know no counter examples, as far as equations (2.1)
of class $\mathfrak{B}_n$ ($n \ge 2$) with $q(x) > 0$, $r(x) \equiv 0$ are concerned. For the
equation of §6, Example 3, which actually gave rise to the present
investigation, the uniqueness theorem is also true, although the conditions of §5 are
not satisfied. We do not prove it in this paper; it requires entirely different
methods.

Throughout the paper, in conditions and statements of the type “$f^{(k)}(x)$
is continuous for $x \ge a$,” $f^{(k)}(a)$ must be interpreted as the right-hand k-th
derivative.


2. Functional equations of class $\mathfrak{A}_n$. We want to study equations of
the type $w(x)f'(x) + f(x) - f(x-1) = 0$, where w(x) is a positive function.
The methods used here compel us, however, to consider the more
general non-homogeneous equation (2.1).

We introduce an arbitrary decreasing positive function $\Phi(x)$ for $x \ge 1$
with convergent integral $\int_1^\infty \Phi(x)\,dx$. Let A be a positive number such that

$A > \Phi(1)$, $A > \int_1^\infty \Phi(x)\,dx$.

3 It is possible of course to extend our results to a wider domain, but such an
extension would not lead to more essential generalizations. If, for instance, f(x) is
absolutely integrable over $0 \le x \le 1$ and if (1.1) is interpreted reasonably then f(x) is
continuous for $1 \le x \le 2$. Our results are not essentially influenced by a shift $x \to x + 1$.

4 The restriction $q(x) > 0$ has to be made here; see Theorem 6, Remark 1.




» t=0, derivable for x= 1 and if it satisfies (2.1) for 721. 


ASYMPTOTICALLY PERIODIC BEHAVIOR. 
If B is a positive and D a non-negative number, the equation 
(2.1) + p(w) —q(a)f(w@—1) =r(2) 


is called an equation of class %,(®, B, D) if the functions w, p, q, r are con- 
tinuous for «= 1 and if, furthermore 


(2. 2) w(t) >0 (#21) 
(2.3) p(t) >4 
(2. 4) | p(x) —q(x)| < B&(z), | r(a)| S (oe 1). 


The equation (2.1) is said to belong to class M,(®, B,D) if it furthermore 
satisfies the following additional conditions: the functions w, p, q, r have a 
continuous derivative for x = 1, a continuous second derivative for 7 = 2,--- 
and a continuous n-th derivative for z= n-+ 1, which satisfy 


? 


(2.5) |w® | < Ba, |p | < Ba, |q® | < Ba, |r) |< De 


for z= k +1, k—1,2,---,n. 

Evidently we have %) © %, 0 W%.0---+-O Mn. An equation is said to 
belong to the class Y,,(®) if, for any n = 0, numbers B, and D, can be found 
such that it belongs to %,(®, Bn, Dn). 

The class 8, (®, B, D) consists of all equations of class %,(®, B, D) which 
satisfy 


(2.6) {w(x)}* < B&(z) (221). 


Evidently (1.1) belongs to &,(a*") if a>0 and to %,(a8), where 
B= Min(2¢,a+1), ifa>4. 

Henceforth symbols (C,, C2,- - - will appear which have to be interpreted 
as follows. If such a © occurs in a formula or statement, then it can be 
given a positive value such that the formula or statement is valid; the choice 
of that value may depend on the function w(2) and on the values of A and B 
only, certainly not on 2, f(x), ®(x), p(x), q(x), r(v) and D. If, however, 
a variable is written down explicitly, e.g. C(u), then the choice may depend 
also on that variable. Analogously we shall use symbols y:, y2,° - - which 
depend on A and B only, and not on w(z). 

The proofs of the relations (1.2) or (1.3) do not require this convention 
about the C’s, but more explicit estimates are needed in 4; it requires only 
little extra trouble to derive them here. 

A function f(z) is said to be a solution of (2.1) if it is continuous for 




1. If (2.1) belongs to B,D) and if the function g(z) 
is continuous for 021, then a uniquely determined solution f(x) can 


be found with f(z) =g(z) (9S2751). 


Proof. Since w(x), p(x), r(z) and g(«—1) are continuous for 
the differential equation 


w(x) + p(2) f(z) = + 


has a solution in the interval 1 = with f(1) —g(1). This process 
can be continued. 


Lemma 2. If the integers k and n satisfy OSkSn, and if (2.1) 
belongs to Un(®, B,D), then any solution f(a) has k +1 continuous deriva- 
tives for z= k+1. 


Proof. We have 
(2.7) = {q(x)f(e@—1) + r(x) — f(@)}/w(@) 21). 


If f(z) has k (k =n) continuous derivatives for c= k then the right-hand 
side of (2.7%) has the same property for z=k-+1. Consequently f(z) 
has &+ 1 continuous derivatives for r=%k-+1. Our lemma follows by 


induction. 


Lemma 3. If the conditions of the previous lemma are satisfied, and if
(2.8) $|f(x)| \le M$ $(0 \le x \le 1)$,
then we have, for $k = 0, 1, \dots, n+1$ and any $s > 0$,
(2.9) $|f^{(k)}(x)| \le (M + D) \cdot C_1(s, k)$ $(k \le x \le s)$.
Proof. First take $k = 0$, and put
(2.10) $q(x)f(x-1) + r(x) = \varphi(x)$ $(x \ge 1)$.
The function $y = f(x)$ solves the differential equation
(2.11) $w(x)y' + p(x)y = \varphi(x)$

and so we find, for $x \ge 1$, $\xi \ge 1$,

(2.12) $f(x) = f(\xi) \exp\{-\int_\xi^x p(t)/w(t)\,dt\} + \int_\xi^x \varphi(t)/w(t) \, \exp\{-\int_t^x p(u)/w(u)\,du\}\,dt$.


ASYMPTOTICALLY PERIODIC BEHAVIOR. 317 


Take €=1, 1S since | o(t)| S(M+D)C, for 1StS2, we 
obtain 
|f(x)| S (M+ D)C, (1s2<2). 


Now taking $\xi = 2$, $2 \le x \le 3$, we can apply the same process and we find
an estimate for $f$ on $2 \le x \le 3$. Thus (2.9) can be proved for $k = 0$.

After that, (2.7) gives the result for $k = 1$. By differentiation of (2.7)
we derive the result for $k = 2$, etc.


THEOREM 1. Jf (2.1) belongs to %,(®,B,D) and if a solution f(x) 
satisfies (2.8), then 


(2. 13) | f(2)|S (M+ (0S2z< a). 


Proof. Let B, (B, > 0) be such that p(x) > 4 > B,+1;B,=—=B 
suffices. Put 
M*(x) =Maxf(t); M(x) Max|f(t)|. 


If x=1 we either have M*(z} — M*(x—1) or there is a number 2 
(c—1S2% such that M*(x) =f (2), f’(%) 2&0. In the latter case 
we have, by (2. 1) 


P(Xo)f (Xo) S (to —1) + r(z). 
It results from (2.2) that, for r= B, + 2 
M* (x) = S M*(x—1) +2 | M*(x—1)| 1) + 2D6(x—1). 


This is, of course, also true in the first case. For Minf(¢) an analogous 
inequality can be found; it follows that 


S M(x—1)- {1 + + 


and so 


D+ M(x) S (D+ {14+ 2(B+1)o(ex—1)} B,42). 


Since @(1) < A, { <A, we have (0O=B, SB) 
1 


I] {1+ 2(B+1)®(Bi <u (yi >1), 


and hence 


(2.14) | | (M(B. +1) + Dy (c= B,). 




We may take B, = B. By Lemma 3 we obtain 
S D)C\(B+1,0) =(V¥+D)C; (OS 
and (2.13) directly follows, with Cy=—y,-(C;+ 1). 
If B, —0 we do not need Lemma 3 for the proof of Theorem 1. Then 
(2.14) expresses 


THEOREM 1a. If the conditions of Theorem 1 are satisfied and if
$p(x) > \tfrac{1}{2}$ for $x \ge 1$, then a number $\gamma_1$, depending on $A$ and $B$ only, exists
such that
$|f(x)| \le (M + D)\gamma_1$ $(x \ge 0)$.
Lemma 4. If the conditions of Lemma 3 are satisfied and if n=1, 
then for the function =f" (uw +k) is a solution of an 
equation 
(2.15) + pe (x) — que — 1) = 
which belongs to Un+( x, Bx, Dy), where 
= O(a +k), we(x) = Mw(a+k), Be=—Co(n), De = (M + D)C;(n). 
Proof. Let $N$ be a natural number and suppose that the lemma has been
proved for $n < N$.⁵ Then we have, by Lemma 3 and Theorem 1
(2.16) |f™(r)|S (M+ D)C3(N) kik =0,1,---,N—1). 
Now suppose that $n = N$.
It is evidently sufficient to consider the case $k = 1$; the cases $k = 2, 3, \dots$
immediately follow from this one by iteration.
Since f(z) satisfies (2.1) we obtain by differentiation, for x = 2, 
w(x) + (p(x) + w'(x) 
=1'(x) — p'(x) f(x) + 
Now putting 
( 2{p(e+1) + 
(2.17) 1 +1) 


the function f,(z) =f’(«—1) satisfies (2.15), with k=—1, for x21. 
We have to prove that the analogues of (2.3), (2.4) and (2.5) hold. 
First we have, since | w’(a)| << B®(x) >0 for r> 


>} forx=Cy+1. 


⁵ If $N = 1$ nothing is supposed here.




We immediately find 
| w, (x)| < 2BG,(x), |qi(x)| < 2BG,(z) 
and hence, for z2/-+ 1, 
| {pi — < (x) + 2 | w (x + 1)| < (2) 
| (a) | < (x) (t= 1,---,N—1). 
Finally, by (2.17), (2.16), (2.4), and (2.5) we have, for 7=0,1,:--, 
2, 
| (x) | < 2{D + 2B(M + D) - 2'C,(N)}- (2). 
Now by taking 
Ce(N) = Max (C5, 6B), = 2 + 24 BC,(N) 
the proof of the case n = N, k = 1 is completed. 
We state separately the result (2. 16): 
THEOREM 2. Jf OS k=n, and if f(x) is a solution of an equation 
B,D) with | f(x)| SM then we have 
| (x) | S (M+ D)Cw(k) (zx = =0,1,- + -,2). 
We may write C,)(/) instead of the Cs(n) from (2.16) since %,(®, B, D} 
is contained in %1;,(®, B, D). 
In 4 we shall need 
Lemma 5. If f(a) satisfies the equations (2.1) of class %2(®, B, D) 
and if a number Q = 2 is given such that 
>3, + > 4 (t2=Q+1) 
then a number yz, depending on A and B only, can be found such that 
(z)| S (+ (k=0,1,2;72Q). 


Proof. For k=0 this follows by applying Theorem 1a to f(~—Q). 
This same theorem can be applied to f’(«—@Q) which satisfies an equation 
of class %,(®e, B:, D,), where By Di = (M+ (ef. the proof of 
Lemma 4). <A third application yields the result for k = 2. 


THEOREM 3. Under the conditions of Theorem 2 we have, if n=1, 
t= kh, 


(2.18) f® (x) — (@—1)| SCu(k) (w(x) + (M+ D). 




Proof. We have by (2.1)
(2.19) $p(x)\{f(x) - f(x-1)\} = \{q(x) - p(x)\}f(x-1) + r(x) - w(x)f'(x)$.
Hence for x= B + 1 we have by (2.2), etc. and Theorem 2 
2| f(x) —f(e@—1)| SCi(n) {w(z) + ©(x)} (M+ D). 
We may write C,; instead of C,2.(n) ; cf. the remark after Theorem 2. This 


proves (2.18) for k—1. The cases k = 2,3,-- - are easily dealt with by 
successive differentiation of (2.19). 


3. Periodicity Theorem. If $w(x) \to 0$ for $x \to \infty$, Theorem 3 yields
$f(x) - f(x-1) \to 0$. It by no means implies that $f(x)$ tends to a periodic
function. But this is easily proved if we suppose that
(3. 1) | w(x)| < (e =1) 
and that f(z) satisfies an equation (2.1) of class %,(®,B,D). We then 


infer from (2.18) that 
—f(e+p) |S (M+ D) Cu JS “swat 


(v, 1, 2, 3,- 


Hence, by Cauchy's theorem, $\lim_{\nu \to \infty} f(x + \nu)$ exists.

We, however, drop this restriction (3.1) and replace it by the weaker
one
(3.2) $\{w(x)\}^2 < B\Phi(x)$ $(x \ge 1)$;

that is, our equation belongs to a class $\mathfrak{B}_n$.


Lemma 6. If f(x) satisfies an equation (2.1) of class B2(, B, D) and | 
if (2.8) holds, then there exists a uniquely determined continuous periodic | 


function of period 1 such that fortz= C2, 


(3.8) f(2) S (+ D) | 
So it 


Proof. By Theorem 2, f’(x) is uniformly bounded for z= 2. 
follows from (3.2) that, for B+ 2 


| + w(x) /p(x)} — f(x) — w(2)/p(2) f(a) | S (M+ D)C(2), 
where Cig = 2BC,o(2). Consequently, by (2.1) 
| f(a + w(x)/p(x)} — f(e—1)| S (p@)}*- — p@)}F(@ — 1) +44 


+ (M+ (2) S (M + 


For t > Cig the function 
y= f w(t) /p(t) dt 
B+1 


is steadily increasing, and dy/dr—>1. We put Take 
> Cis +1 and put z*=€(yo), so that 


(3.5) 
Since 
|{w(x)/p(x)}’ | < (t2=B+1), 
we easily infer from (3.5) that 
(3.6) | 2* — w(29)/p (ao) | < (29 — 1) (t% = Cn +1). 


The function f’(x) is also uniformly bounded, and so by (3.6) and (3. 4) 
| HE(yo) — HE(Yo—1)}| =| f(2*) — f(t —1)| 
= | f {to + w(Xo)/p(ro)} — f(t. — 1)| + (M + D)Cy0(1) CooP(a — 1) 
S (M+ D)C2.8(2)—1). 


Since f < we find that 
1 
— lim + 
exists if y runs through the natural numbers, and (3.3) immediately follows. 


Evidently y(y) has the period 1. Moreover, from (3.3) and from the fact 
that f’(z) is bounded, it is easily deduced that 


(3.7) | S (M+ | — | 
for any pair of values y;, y2; hence w(y) is continuous. 


THEOREM 4. If f(a) is a solution of the equation (2.1) of class 
Bnio(’, B,D), (n=O) then there exists a uniquely determined periodic 
function y(y), with period 1 and with n successive derivatives, such that for 


(3.8) | f w(t)/p(t)at)| S (M+ 


Proof. Lemma 4 shows that f* (a -+ &) satisfies an equation of class B., 
so that, by Lemma 5, continuous functions Wo, ¥i,° * *, Yn exist such that for 
t=k+ Cxu(k), 


(8.9) | f | S (M+ D)Ora(k) dt 




Here we have p,(t) = p(t) + kw’(t) (cf. the proof of Lemma 4), and 
is chosen such that p(t), p.(t),- pn(t) are all >4 for Since 
for 


(cf. (3.2)) we have, in virtue of (3.7), for k=0,1,---,n, 


(3. 10) | f(x) — dt)| = (M+ D)C,;(k) dt, 
B+ 
where 


X 


By integration of (3.10) we obtain, using the facts that the ¢,;’s are con- 
tinuous and that for r> 


B 
de = — de (2) 1,2,- + 
for any and It results that 


4. The correspondence $f \to \psi$; the completeness theorem. We consider
an equation (2.1) of the type $\mathfrak{B}_n$ $(n \ge 2)$. Any continuous function $f(x)$
arbitrarily given for $0 \le x \le 1$ gives rise to a continuous function $\psi(x)$ of
period 1 in virtue of Lemma 6. We shall investigate this correspondence.
Since it is linear we may restrict ourselves to the homogeneous case $r(x) = 0$.
Then the correspondence $f \to \psi$ defines a linear operator $\psi = Tf$. The domain
is the set $D$ of all continuous functions on $0 \le x \le 1$, or, since any such
function can be continued to a solution, the set of all solutions of our equation.
The range $R$ is the set of all $\psi = Tf$, $f \in D$. By $R_0$ we denote the set of all
continuous functions of period 1. Using the metric:

(4.1) $\|\psi\| = \max |\psi(x)|$,


we have 


THEOREM 5. The operator T is continuous in D: 
(4. 2) ITF SCs (feD). 


Proof. This immediately follows from (2.13) (where D=0 since our 
equation is homogeneous) and from Lemma 6. 




THEOREM 6. (Completeness Theorem). If $q(x) > 0$ for $x \ge 1$, then $R$
is dense in $R_0$, that is to say, if $\chi \in R_0$ and $\epsilon > 0$ are given, then a function
$f \in D$ can be found such that $\|\chi - Tf\| < \epsilon$.


Remarks. 1. The condition $q(x) > 0$ for $x \ge 1$ is essential. If we had,
for instance, $q(x) = 0$ for $1 \le x \le 2$, then any pair of solutions $f_1$ and $f_2$
with $f_1(1) = f_2(1)$ would coincide for $x \ge 1$, so that $R$ would consist of all
multiples of a single function.


2. Theorem 6 is also true if an equation of class %{, is concerned, with 
the supplementary condition (3.1); only very little alterations in the proof 
below will be required. 


Lemma 7. Let the equation (2.1) be of class $\mathfrak{A}_0$, with $q(x) > 0$ $(x \ge 1)$.
For $Q \ge 0$ denote the set of continuous functions on $Q \le x \le Q+1$ by $D_Q$.
Let for $0 \le R \le Q$ the operator $T_{RQ}$ be defined as follows: if $g \in D_R$, $h \in D_Q$,
and if the function $f(x)$ satisfies (2.1) for $x \ge R+1$, $f$ continuous for $x \ge R$,
$f = g$ $(R \le x \le R+1)$, $f = h$ $(Q \le x \le Q+1)$, then we put $h = T_{RQ}\,g$.
Then $T_{RQ}D_R$ is dense⁶ in $D_Q$.


Proof. It results from (2.12), with $\xi = R+1$, $r(x) = 0$, that the
operator $T_{RQ}$ is continuous. Furthermore, for $P \le R \le Q$, we have $T_{RQ}T_{PR} = T_{PQ}$.
Thus, if $T_{PR}D_P$ is dense in $D_R$ and $T_{RQ}D_R$ is dense in $D_Q$, then
$T_{PQ}D_P$ is dense in $D_Q$. Consequently it suffices to prove our lemma for the
case $R \le Q \le R+1$; the general case follows by iteration of this one.

Let $R \le Q \le R+1$, $\epsilon > 0$ and $h \in D_Q$. Construct a function $h_1 \in D_Q$
with a continuous first derivative, such that
(4.3) $w(x)h_1'(x) + p(x)h_1(x) - q(x)h_1(x-1) = 0$
holds for $x = Q+1$, and $\|h_1 - h\| < \epsilon$. Since $q(x) > 0$ we can continue $h_1$
backwards over $R \le x \le Q$ such that it satisfies (4.3) for
$R+1 \le x \le Q+1$ and such that it is continuous for $R \le x \le Q+1$.

If $g \in D_R$ is defined by $g(x) = h_1(x)$ $(R \le x \le R+1)$, then we have
$T_{RQ}\,g = h_1$, and so $\|T_{RQ}\,g - h\| < \epsilon$, which proves the lemma.


Proof of Theorem 6. Let My be the largest of the numbers || x ||, || x’ || 
and ||” ||. Let «, (0 <«,<1) be a number which is to be fixed later on. 
Let the number Q > B+ 3 be such that for r= Q@—1 we have 


p(t) >4, p(x) + 2w’(x) > 
) < 1) <4, f dt < 4; 


⁶ With the metric defined analogously to (4.1).




and such that 
Q 

(4.5) w(t) /p(t) dt 
B+1 


is an integer. It follows from Lemma 7, with R = Q — 2, that a continuous 

function f,(z) can be found in 9 —2[x=@Q-+1 which solves (2.1) for 

such that 

(4. 6) | —x(@—Q)| <4 (QSr7SQ+1). 

Again by Lemma 7, with R = 0, a solution f(z) of (2.1) can be found such 

that 

(4.7) |f(2) p(2){w(2)} 
(Q—2=259—1). 

Putting f(z) —fi(v) =f2(x) we have 

f'2(%) = — 1) — p(x) f2(x) }/w(a) (Q—1S7=0+1), 


and a similar expression for f’, (rx2=@Q). From the inequalities (2.4), 
(2.5), (4.4), and from p(z) > 4 we now easily infer that 


(4.8). | —f,(2)| < yes 


ys depending on A and B only. Consequently, by Lemma 5, 
(4.9) 0,1,2;229Q). 
We can now derive by the process of Lemma 6° 

| le + /p(e)} — f(@—1)| S yo (t=Q+1) 
and, if Tf —y, 
(4.10) | f(z) —v(e— BQ). 


It follows from (4.9) and from Theorem 4 (k —1) that || y’ = M,. So 
(4.10) and (4.5) lead to 


| f(z) —¥(@—Q4+ N)| + 
and from (4.6), (4.8) we infer 
+1 + (+97) (Mo t+ + 1) y2} = erys(x). 


Now take «, such the right-hand side equals ¢; then it results that || x—y| 
=Ix—Tfll<« 


7 We can takeC.,; = Cx = Q, and then for «= Q all other C’s can be replaced by 73. 




5. The uniqueness problem. We consider the equation
(5.1) $w(x)f'(x) + p(x)f(x) - f(x-1) = 0$.


Let B and p be positive constants, p>1, and suppose that for c21 
(n=0,1,2,:--), the functions w™(z) and p(x) are continuous and 


satisfy 
(5.2) |w™(r)| <B —1}™| < Bente? = 1). 


Under these conditions we can prove 


THEOREM 7. If $f(x)$ is a solution of (5.1) and if $\lim_{x \to \infty} f(x) = 0$, then
we have $f(x) \equiv 0$.


Remarks. 1. The condition w(x) >0 (x21) is not needed here. If 
it is valid, however, then the equation (5.1) belongs to B,(2) and gives 
rise to a one-to-one correspondence f <> y. 


2. With very little alterations in the proof below we can show that 
Theorem 6 remains true if the right-hand sides of (5.2) are replaced by 


where the numbers B, a, 8, p satisfy 
B>0, p>1, p> p>a—B+1. 

For instance, the function = exp (—e*) satisfies this condition, with 
a= 3, B and p arbitrary; it does not satisfy (5. 2).® 

Lemma 8. Jf a=0, if f(x) is a solution of (5.1), and if g(x) satisfies 
the “adjoint equation” 
(5. 3) —p(z)g(e@—1) + 9(z) = 0 


forx=a+1, g(x) continuous for =a, then the expression 


(5.4) = FO g (at + w(x) f(a) g(@—1) 


is independent of x forxZ2a+1. 


This directly follows by differentiation of (5.4) with respect to 2. 

SCondition (5.2) implies that w(#) and p(w) can be continued analytically 
throughout a region | arga| <3, |#| >1—é6, (5,.>0) and that w(#) = O(a), 
p(w) —1=O(a-»), in that region. Conversely, from these facts we can deduce that 
(5.2) is valid for some B> 0. 




Lemma 9. (Theorem of Carleman-Mandelbrojt*). Ifa<b and >1, 
then a function g(x) exists fora b, differentiable infinitely often, such 
that for n=0,1,2,-- - 


(| go™ (2) | (as 2s b;0° = 1), (a) = go) (b) = 0, 


CO) 20 >0. 


Proof of Theorem 7. Suppose that $f(x)$ is not identically zero, then
numbers $a$ and $b$ $(0 \le a < b \le a+1)$ can be found such that $f(x) \ne 0$
$(a \le x \le b)$. Take $1 < \eta < \rho$. Let $g_0(x)$ be the function of Lemma 9, put
$g_0(x) = 0$ for $b \le x \le a+1$, and let $g_n(x)$ be defined recursively by


(5. 6) = (1—Qn) gna(2), 


where the operator Q, is defined by 


(5. 7) Qnd = An + OY, 

(5.8) wale) w(x +n). 
Furthermore we define g(x) for 2a by | 

(5.9) g(a@+n) 


This g(x) clearly satisfies (5.3) for z=a-+ 1; we shall prove that g(z) is 
bounded for =a. 

First we estimate the expression 0%,Qx,,,° for lS hi << < ks, 
a@a=xSa+1. We write it as a polynomial in the @’s, the w’s, go and their 
derivatives, and we then observe, by induction, that all its terms are contained 
in the expansion of 


(5. 10) (1 + d/dx)*{ (ex, + we,) (%, + Wes) Jo}- 
We can write this in the form 


where p» denotes the differential operator working on gy only, pw; the one 
working on a; and w; only. We now replace the @’s, w’s, go and their 


®See S. Mandelbrojt, Séries de Fourier et classes quasi-analytiques de fonctions, 
Paris, 1935, p. 69; the above lemma is a direct application. Mandelbrojt’s contribution, 
viz., the fact that g,(a#) can be chosen 20, is essential in the sequel. 

We do not actually need the Carleman-Mandelbrojt theorem here since it can 
be verified that g.(x) = exp (a — a)1/(y-1) — (b — #)1/(m-1)} (ax<a<b), g(a) 
= g.(b) =0 satisfies the conditions of the above lemma. 




derivatives by the majorants given by (5.2), (5.5) and we even replace m” 
for m<s by s”. We infer that the sum of the absolute values of the terms 
in (5.10) is less than 


U = Bi(k,- (1 + 87+ 28/k, +: 28/ks)8, 
and a fortiori we obtain 
~ ) 
0 
A,, A2,* + - denote positive constants depending on B, p and 7 only. 
From (5.6) and (5.11) it results that 
| gn(x)| = |(1—Qn) go | 
n n 
=B fe II (1 + dt B f (1 + 
0 k=1 k=1 


where r= 4(p+7). Finally we obtain 


| gn(x)| SB Jf t + > dt. 


The integral on the right is convergent (since p > 7, 7 <7) and its value 
is independent of n. Consequently | g(x)| < Ag for x2a. 

The proof of Theorem 7 now easily follows from Lemma 8. From 
f(x) +0, | g(x)| < A, for oo we obtain by (5.4) that (f,g) =0. On 
the other hand, by taking x a+ 1 in (5.4), we find 


(f,9) = g(t) dt. 


But since $f(x)$ is continuous and $\ne 0$ for $a \le x \le b$, $g(x) \ge 0$, $g(x)$ not
identically 0, we find $(f, g) \ne 0$. This is contradictory; it follows that $f \equiv 0$.


6. Applications. We shall apply the contents of this paper to some
equations of the form

(6.1) $G'(x) = K(x)G(x-1)$,

where $K(x)$ is a given function, positive and continuous for $x \ge 1$.


If $G_0(x)$ is a solution of (6.1) and $G_0(x) > 0$ for $0 \le x \le 1$, then
clearly $G_0(x) > 0$ for $x \ge 0$. Moreover, if $G(x)$ is a second solution, then
$G(x)/G_0(x)$ is bounded for $x \ge 0$. Namely, if $M$ is the maximum of
$|G(x)/G_0(x)|$ for $0 \le x \le 1$, then it is easily seen by step-by-step integration
that $|G(x)| \le MG_0(x)$ for $x \ge 0$.
If we put $G(x) = G_0(x)f(x)$, the function $f(x)$ satisfies

(6.2) $(G_0(x)/G_0'(x))\,f'(x) + f(x) - f(x-1) = 0$,
which belongs to class $\mathfrak{A}_0$.
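The two steps just described can be written out as follows. This is only a sketch of the reasoning, filling in what "step-by-step integration" and the substitution $G = G_0 f$ amount to; it uses nothing beyond (6.1), the positivity of $K$ and $G_0$, and the definition of $M$.

```latex
% Step-by-step bound on 1 <= x <= 2, using |G(t-1)| <= M G_0(t-1) there:
\begin{align*}
|G(x)| &\le |G(1)| + \int_1^x K(t)\,|G(t-1)|\,dt
        \le M G_0(1) + M \int_1^x K(t)\,G_0(t-1)\,dt \\
       &= M G_0(1) + M\{G_0(x) - G_0(1)\} = M G_0(x).
\end{align*}
% The same argument on 2 <= x <= 3, 3 <= x <= 4, ... gives the bound for all x >= 0.

% Substitution G = G_0 f in (6.1), using G_0'(x) = K(x) G_0(x-1):
\begin{align*}
G_0'(x) f(x) + G_0(x) f'(x) &= K(x)\,G_0(x-1)\,f(x-1) = G_0'(x)\,f(x-1), \\
\frac{G_0(x)}{G_0'(x)}\,f'(x) + f(x) - f(x-1) &= 0 .
\end{align*}
```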
Example 1. The equation
(6.3) $G'(x) = 2x\,e^{2x-1}\,G(x-1)$.
A special solution is $G_0(x) = \exp x^2$, and so (6.2) becomes
$(2x)^{-1} f'(x) + f(x) - f(x-1) = 0$.

This equation belongs to class $\mathfrak{B}_\infty$, with $\Phi(x) = x^{-2}$. Hence any solution of
(6.3) gives rise to a periodic function $\psi(x)$ of period 1, such that

$G(x) = e^{x^2}\{\psi(x - \tfrac{1}{2}\log x) + O(x^{-1})\}$.
The completeness theorem is valid, but the uniqueness problem is still unsolved.
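That $\exp x^2$ is a special solution can be checked numerically. The sketch below assumes the reading $G'(x) = 2x\,e^{2x-1}G(x-1)$ of (6.3); the names `G0`, `K` and `residual`, and the central-difference step, are illustrative choices made here and are not part of the paper.

```python
import math


def G0(x):
    """Candidate special solution exp(x^2)."""
    return math.exp(x * x)


def K(x):
    """Coefficient of (6.3) under the assumed reading 2x * exp(2x - 1)."""
    return 2.0 * x * math.exp(2.0 * x - 1.0)


def residual(x, h=1e-6):
    """Relative defect of G0'(x) = K(x) * G0(x - 1), via a central difference."""
    derivative = (G0(x + h) - G0(x - h)) / (2.0 * h)
    return abs(derivative - K(x) * G0(x - 1.0)) / G0(x)


for x in (1.0, 1.5, 2.0, 3.0):
    print(x, residual(x))
```

The printed defects should be at the level of the finite-difference error, which is consistent with the identity $2x\,e^{2x-1}e^{(x-1)^2} = 2x\,e^{x^2}$.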
Example 2. The equation
(6.4) $G'(x) = (\exp x^2)\,G(x-1)$.

This example is less artificial than the previous one; it is not easy to give
an explicit solution $G_0(x)$. Nevertheless we can try to find a function
$G_0^*(x) = \exp H(x)$ such that the substitution $G(x) = G_0^*(x)f(x)$ leads to an
equation of a class $\mathfrak{B}_n$ $(n \ge 2)$. We find, if $H'(x) = h(x)$,
(6.5) $f'(x)/h(x) + f(x) - \exp\{x^2 - \int_0^1 h(x-t)\,dt - \log h(x)\}\,f(x-1) = 0$,

and hence we have to find a function $h(x)$ which makes the absolute value of

(6.6) $x^2 - \int_0^1 h(x-t)\,dt - \log h(x)$

less than a function $\Phi(x)$ with convergent integral. The following iteration
process gives a result in a finite number of steps in this and in many other
cases. Put

$h_0(x) = 1$, $h_1(x) = x^2$, $h_{n+1}(x) = h_n(x) + \Delta_{n+1}(x)$,

$\Delta_{n+1}(x) = x^2 - \int_0^1 h_n(x-t)\,dt - \log h_n(x)$,




so that An,i (2) is the result of substituting h(x) =hn(zx) into (6.6). Since 
we shall find w(x) = {h(x)}-* ~ 2? we cannot apply our theory with a function 
® of order less than 2~* (cf. (2.5)), therefore we stop our iteration. at the 
first A, which is O(27*). 
We obtain 

A;(z) = 4 — + 2x7 + 2x7 log — 2x log x + 

= — log x + O(a) 

A;(x) = 
h(t) = 2? + 24+ 4— 221+ 2 loge + 227 log x — 423 log x + O(a). 
Now take 
(6.7) H(x) = + + Ba — 3271 — 22 log x — 

— 2 log 2a log + 2a log 2, 


so that H’(z) = h(x) = + O(a*). Now the equation (6.5) belongs 
to B.(a-*) ; the required inequalities (2.5) for the derivatives of w(x) and 
q(x) easily follow from the fact that these functions are power series in 2! 
and x log a. It results that to any solution of (6.4) there belongs a periodic 
function y(a), of period 1, having derivatives of all orders, with 


G(x) = + + O(a*)}, 
or 


G(x) = {y (x) + O(2*)}. 


H(z) is given by (6.7). The completeness theorem holds; the uniqueness 
theorem can be proved by applying Theorem 7 to the equation (6.5) ; first 
divide it by the coefficient of f(~—1). 


Example 3. Mahler's partition problem¹⁰ gives rise to the functional
equation $F'(y) = F(y/r)$, where $r$ is a constant $> 1$. We transform it by
$y = r^x$, $F(r^x) = G(x)$ into

(6.8) $G'(x) = e^{\alpha x + \beta}\,G(x-1)$ $(\alpha > 0)$,


¹⁰ K. Mahler, "On a special functional equation," Journal of the London Mathematical
Society, vol. 15 (1940), pp. 115-123.

N. G. de Bruijn, "On Mahler's partition problem," Proceedings of the Koninklijke
Nederlandsche Akademie van Wetenschappen, Amsterdam, vol. 51 (1948), pp. 659-669
= Indagationes Mathematicae, vol. 10 (1948), pp. 210-220.




where especially $\alpha = \log r$, $\beta = \log\log r$. Dealing with (6.8) by the method
of the preceding example, we find
$h_0(x) = 1$, $h_1(x) = \alpha x + \beta$
= $a — log a— log — Baa" + 
log a) + log « 
+ a?(— B—4a-+ log a)a log x + log? + 
A,(@) = — fa log a*r™ log + O(2?) 
A;(x) == O(x*) 
(HH (x) $a(a—alogz)? + (1 +8 + $a—log a)e 
(6. 9) + (—1+ a" log «— B/a) log — $a-*x" log’ x 
+ + B— log «)a* log a. 
To every solution of (6.8) there corresponds a function y(y) of period 1, 
with derivatives of all orders, such that 14 


(6. 10) G(x) = e#@ {y (a — log x/a + log + O(a")}; 


the rather complicated function $H(x)$ is given by (6.9). Section 4 applies
to this case, section 5 does not. Nevertheless the uniqueness theorem is true;
this will be shown in a separate paper dedicated specially to (6.8). There
we shall express $\psi(x)$ explicitly in terms of the values of $G(x)$ on $0 \le x \le 1$,
and vice versa.

From the results to be obtained in that paper it follows that (6.10) is 
the best possible result of its kind: it is not true that functions H,(2) 
= O{H(zx)} and n(x) — o exist such that every solution of (6.8) is of the 


form 


G(x) = + 


W(y) periodic mod 1. 


MATHEMATISCH INSTITUUT DER TECHNISCHE HOGESCHOOL, 
DELFT, NETHERLANDS. 


11 We take = 




ON THE CLASSICAL EXISTENCE THEOREM OF LINEAR 
DIFFERENTIAL EQUATIONS.* 


By AUREL WINTNER.


The purpose of the following considerations is, first to formulate, and 
then to prove, the classical representation theorem of homogeneous, linear 
systems of ordinary differential equations so as to apply to absolutely conver- 
gent Dirichlet series or Laplace integrals. 


THEOREM. On the half-line
(1) $1 \le t < \infty$,

let $\alpha(t)$, $\beta(t)$ denote n by n matrices of functions each of which is of bounded
variation on (1), and put

(2) $A(s) = \int_1^\infty e^{-st}\,d\alpha(t)$
and

(3) $B(s) = \int_1^\infty e^{-st}\,d\beta(t)$,
where

(4) $0 \le s < \infty$.

There belongs to every $\alpha$ a $\beta$ having the following property: n linearly
independent solution vectors $x(s)$ of the system of n linear differential
equations

(5) $x' = A(s)x$,

where $x' = dx/ds$, are supplied by the n columns of the matrix

(6) $X(s) = E + B(s)$,
where
(7) $E$ is the unit matrix.

It should be emphasized that the absolute convergence of the unknown 


* Received September 3, 1948. 




matrix integral, (3), is claimed on the entire half-line, (4), on which the 
absolute convergence of the given matrix integral, (2), is assumed. 

It will be clear from the proof (cf. (27) below) that the assertion of 
the theorem can be amplified by the following 


Remark. Under the assumptions of the Theorem, the spectrum of $\beta$ is
contained in any closed ray containing the spectrum of $\alpha$.

It is understood that the spectrum of $\alpha$ is defined to be the set of those
points $t_0$ of the half-line (1) corresponding to which there does not exist an
$\epsilon > 0$ satisfying $\alpha(t) = \mathrm{const.}$ for $|t - t_0| < \epsilon$, and that a ray, $R$, of a set,
$S$, is meant to be a set containing $S$ and having the property that the sum of any
two (not necessarily distinct) values contained in $R$ is in $R$.

In order to illustrate the content of the Theorem (and of the Remark),
suppose that each of the $n^2$ elements of the coefficient matrix $A(s)$ of a
system (5) is a Dirichlet series without constant term, say

(8) $\sum_{m=1}^{\infty} a_m e^{-\lambda_m s}$;

that each of these Dirichlet series is absolutely convergent on (4); finally,
that (by the insertion of vanishing coefficients $a_m$) the notation is so chosen
that there belongs to every $h$ and every $j$ $(\le h)$ an $m$ satisfying $\lambda_h + \lambda_j = \lambda_m$.
It then follows that every component $x_i = x_i(s)$ of every solution vector
$x = (x_1, \dots, x_n)$ of (5) is of the form $x_i(s) = c_i + d_i(s)$, where
$c_1, \dots, c_n$ are integration constants, which can be
assigned arbitrarily, and $d_1(s), \dots, d_n(s)$ are Dirichlet series having the
same exponents $\lambda_m$ as (8) and converging on the entire half-line (4).

The particular case $\lambda_m = m$ is classical. For, if the $n^2$ elements of $A(s)$
belong to the case $\lambda_m = m$ of (8), the substitution $e^{-s} = z$ reduces (5) to

(10) $dw/dz = P(z)w$,

where each of the $n^2$ elements of $P(z)$ is a regular power series,

(11) $\sum_{m=0}^{\infty} b_m z^m$

(where $b_0 \ne 0$ is allowed). Hence, what results is the classical result,
according to which the regularity of the coefficient matrix (11) in a circle
$|z| < r$ implies the regularity of every component $w_i = w_i(z)$ of every
solution $w = (w_1, \dots, w_n)$ of (10) in the entire circle.¹
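The reduction to (10) can be made explicit. The following is only a sketch: it reads the substitution as $z = e^{-s}$, writes the matrix $A(s)$ as a series of the form (8) with $\lambda_m = m$ and matrix coefficients $a_m$, and derives the relation between the $a_m$ and the $b_m$ of (11); that relation is worked out here and is not quoted from the text.

```latex
\begin{align*}
z = e^{-s},\quad w(z) = x(s)
  &\;\Longrightarrow\; \frac{dx}{ds} = \frac{dw}{dz}\,\frac{dz}{ds} = -z\,\frac{dw}{dz}, \\
x' = A(s)\,x,\quad A(s) = \sum_{m=1}^{\infty} a_m e^{-ms} = \sum_{m=1}^{\infty} a_m z^{m}
  &\;\Longrightarrow\; \frac{dw}{dz} = -\frac{1}{z}\sum_{m=1}^{\infty} a_m z^{m}\,w = P(z)\,w, \\
P(z) = \sum_{m=0}^{\infty} b_m z^{m},
  &\qquad b_m = -a_{m+1}.
\end{align*}
```

The absence of a constant term in (8) is what makes $P(z)$ regular at $z = 0$, while $b_0 = -a_1$ need not vanish.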


¹ What actually follows is somewhat more, namely, the absolute convergence of the
power series of every function $w_i(z)$ on the boundary $|z| = r$, if the power series (11),




LINEAR DIFFERENTIAL EQUATIONS. 333 


The classical proofs of the latter theorem fail to apply in the case of
arbitrary Dirichlet series, $\lambda_m \ne m$, mainly because the regularity of the
functions ceases to imply the convergence (not to say the absolute convergence)
of the Dirichlet expansions. In fact, this difficulty arises even in the case
of ordinary Dirichlet series,

(12) $d(s) = \sum a_m/m^s$,

where, in (8),
(13) $\lambda_m = \log(m + 1)$.

It is of no avail that (12) cannot have a half-plane of convergence without
having some half-plane of absolute convergence.
The proof of the Theorem proceeds as follows: 


Under the assumptions placed on the coefficient matrix, $A(s)$, of (5)
in the Theorem, put

(14) $[\alpha] = \max([\alpha_{11}], [\alpha_{12}], \dots, [\alpha_{nn}])$,

where, if $\alpha_{ik}(t)$ denotes the $(i, k)$-th element of the matrix $\alpha(t)$,

(15) $[\alpha_{ik}] = \int_1^\infty |d\alpha_{ik}(t)|$.

Then the assumption placed on (2) is
(16) $[\alpha] < \infty$.

It can be assumed that
(17) $\alpha(1) = 0$

(0 = zero matrix), since (2) remains unaltered if a constant matrix is
added to the matrix $\alpha(t)$.

Let X be an n-ary square matrix the columns of which are solution 
vectors of (5). Then (5) is equivalent to 


(18) X’(s) =A(s)X(s). 
Hence, if B(s) is defined by (6) and (7), 
(19) B’(s) = A(s) + A(s)B(s). 


representing the coefficient functions of (11) when |z| <r, are supposed to converge 
absolutely on | z| =r. 
This refinement of the classical theorem can readily be verified directly. 




Since $B(s)$ is required to become of the form (3), it is required that
(20) $B(\infty) = 0$.

Hence, the successive approximations $B_1(s), B_2(s), \dots$, which should lead,
in the form

(21) $B(s) = \sum_{m=1}^{\infty} B_m(s)$,

to the desired solution, (3), of (19), must be set up as follows:

(22) $B_{m+1}'(s) = A(s)B_m(s)$,
where $m = 1, 2, \dots$ and
(23) $B_1'(s) = A(s)$.


In order to obtain every term of (21) in a form corresponding to (3), try

(24) $B_m(s) = \int_1^\infty e^{-st}\,d\beta_m(t)$,

where $m = 1, 2, \dots$. The unknown matrices $\beta_m(t)$ occurring in (24) can
be normalized by

(25) $\beta_m(1) = 0$;

cf. (17).

According to the case $m = 1$ of (24), the initial requirement, (23),
of the recursion formula, (22), is satisfied if $\beta_1$ is defined (at its continuity
points) by (25) and

(26) $-t\,d\beta_1(t) = d\alpha(t)$.
For $\beta_2, \beta_3, \dots$, substitution of (24) into (22) supplies the recursion formula
(27) $-t\,d\beta_{m+1}(t) = d\{\alpha(t) * \beta_m(t)\}$,

where the asterisk is the symbol of convolutions (of matrices; so that $\lambda * \mu$
is not identical with $\mu * \lambda$). Needless to say, only $\beta_m(t + 0)$ or $\beta_m(t - 0)$
is relevant in (26), (27).

It is readily seen from (15), and from the definition of a convolution
as an integral, that, if $\lambda_{ij}(t)$ and $\mu_{jk}(t)$ are scalar functions,

$[\lambda_{ij} * \mu_{jk}] \le [\lambda_{ij}][\mu_{jk}]$.




Hence, if $\lambda(t)$ and $\mu(t)$ are n-rowed square matrices, it follows from (14)
and (15) that

(28) $[\lambda * \mu] \le n[\lambda][\mu]$,

the factor $n$ being introduced by the summations which occur in a matrix
product. On the other hand, $t \ge 1$, by (1), and so, by (26),
(29) $[\beta_1] \le [\alpha]$.

For the same reason,

$[\beta_{m+1}] \le [\alpha * \beta_m]$,
by (27). Consequently, an induction and (29) show that
(30) $[\beta_m] \le (n[\alpha])^m$.


It follows from (29), (30) and (16) that (26) and (27) define on the
half-line (1) matrix functions each of which has a finite total variation on
(1). Thus, each of the integrals (24) is defined, and converges absolutely,
on the half-line (4).

On the other hand, (29) and (30) fail to lead, via (24) and (21),
to an integral representation (3) in which, as claimed by the Theorem,
$[\beta] < \infty$. In fact, all that follows in this manner is the absolute convergence
of (3) on some half-line $s_0 \le s < \infty$, with a sufficiently large $s_0$, rather
than on the given half-line (4), where $s_0 = 0$. Correspondingly, what is
thus supplied by (30) is of the type of the local existence theorem of non-linear
analytic differential equations.

Thus, in order to prove the Theorem, the linear character of (5) will
have to be used more fully than above. This will be accomplished by proving
that (30) can be refined to

(31) $[\beta_m] \le (n[\alpha])^m/m!$,
where $m = 1, 2, \dots$.


To this end, use will be made of the notion of the spectrum, as defined
after the italicized Remark following the Theorem. It is clear from this
notion, and from the integral defining a convolution $\lambda * \mu$ of two matrices
$\mu = \mu(t)$, $\lambda = \lambda(t)$ on (1), that a $t$-value, say $t_0$, cannot be in the spectrum
of $\lambda * \mu$ unless $t_0$ is in the closure of the set of those numbers which are
representable in the form $t_1 + t_2$, where $t_1$, $t_2$ are points contained in the
spectra of $\lambda$, $\mu$ respectively. This implies that, if the spectra of $\lambda$, $\mu$ are
contained in the respective half-lines $a \le t < \infty$, $b \le t < \infty$, where $a \ge 0$,



$b \ge 0$, then the spectrum of $\lambda * \mu$ is contained in the half-line $a + b \le t < \infty$.
Since all spectra are confined to the half-line (1), it now follows from
(26) and (27), by choosing

$\lambda = \alpha$, $\mu = \beta_m$, $a = 1$, $b = m$

and applying an induction, that the spectrum of $\beta_m$ is contained in the
half-line $m \le t < \infty$, where $m = 1, 2, \dots$. In view of (25) and of the
definition of a spectrum, this means that
(32) $\beta_m(t) = 0$ for $1 \le t < m$ (if $m > 1$).
It is seen from (32), (27) and (1) that
(33) $(m + 1)[\beta_{m+1}] \le [\alpha * \beta_m]$;
cf. (14), (15). But (33) and (28) show that
(34) $[\beta_{m+1}] \le n[\alpha][\beta_m]/(m + 1)$.

If an induction is applied on (34), it is seen from (29) that the proof of
(31) is now complete.
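The recursion (26), (27) and the bound (31) can be followed by hand in the simplest case. The sketch below assumes $n = 1$ and takes $\alpha(t)$ to be a step-function with a single jump $a$ at $t = 1$, so that $A(s) = a e^{-s}$; then each $\beta_m$ has a single jump $b_m$ at $t = m$, in accordance with (32). The names `beta_jumps` and `X`, the value of `a` and the truncation length are illustrative assumptions made here and do not occur in the paper.

```python
import math


def beta_jumps(a, terms):
    """Jumps b_1, ..., b_terms of beta_m at t = m, for alpha with one jump a at t = 1."""
    jumps = []
    b = -a  # (26) at t = 1: -1 * b_1 = a
    for m in range(1, terms + 1):
        jumps.append(b)
        b = -a * b / (m + 1)  # (27) at t = m + 1: -(m + 1) * b_{m+1} = a * b_m
    return jumps


def X(s, a, terms=40):
    """Scalar case of (6), (21), (24): X(s) = 1 + sum over m of b_m * exp(-m s)."""
    jumps = beta_jumps(a, terms)
    return 1.0 + sum(b * math.exp(-m * s) for m, b in enumerate(jumps, start=1))


a = 2.5
for s in (0.0, 0.5, 2.0):
    # Closed form of the solution of X' = a exp(-s) X with X(infinity) = 1.
    print(s, X(s, a), math.exp(-a * math.exp(-s)))
```

In this special case $|b_m| = |a|^m/m!$, so that (31) holds with equality, and the series (21) sums to $\exp(-a e^{-s}) - 1$ on the whole half-line (4).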

Let $c = n[\alpha]$. Then, since $n$ is the fixed dimension number, $0 \le c < \infty$
by (16). Hence, from (31),

$\sum_{m=1}^{\infty} [\beta_m] \le \sum_{m=1}^{\infty} c^m/m! < \infty$.

It follows therefore from (14), (15) and (25) that the series

$\beta(t) = \sum_{m=1}^{\infty} \beta_m(t)$

defines on the half-line (1) a function $\beta(t)$ satisfying
$[\beta] < \infty$ (and $\beta(1) = 0$).

It is also seen that this $\beta(t)$ and the assignment (3) define on the half-line
(4) a function $B(s)$, and that the latter is representable on the half-line (4)
in the form (21). In fact, the convergence of the series (21) at every $s \ge 0$
and the legitimacy of the term-by-term integration leading from (3) and
(24) to (21) are clear from the last three formula lines.

Finally, the functions $B_m(s)$ occurring in (21) have been defined, after
(21), by the method of successive approximations, which belong to the initial
condition (20). Hence it is clear, from the standard argument in successive
approximations, that the matrix $X(s)$ which is defined on the half-line (4)
by (6) and (21) is a solution of (18). That the assignment (20) is actually



satisfied by the $B(s)$ constructed, is seen by using for $B(s)$ its representation
(3). In fact, since $[\beta] < \infty$ and $t \ge 1 > 0$ in (3), the integral
(3) tends exponentially to 0, as $s \to \infty$.

In order to complete the proof of the Theorem, all that remains to be
ascertained is that the columns of the matrix (6) are linearly independent,
i.e., that $\det X(s) \ne 0$ for every $s$ on (4). Since (18) is known to imply
that

$\det X(s) = \det X(s_0) \exp \int_{s_0}^{s} \operatorname{trace} A(t)\,dt$,

it is sufficient to show that $\det X(s) \ne 0$ for some $s$ on (4). But this is
clear from (20) and (7).


APPENDIX. 


The Theorem, thus proved, has extensions in two obvious directions.

On the one hand, the lower end-point, $t = 1$, of the half-line (1) can
be replaced by any positive number, $t = p > 0$. For, if the lower limit of
integration in (2) and (3) is any $t = p > 0$, it can be reduced to $t = 1$ by
a change, $t \to pt$, of the integration variable. It is true that the $e^{-st}$ in
both definite integrals then becomes replaced by $e^{-spt}$. But this reduces to
$e^{-st}$ by the substitution $s \to s/p$, which is admissible, since it transforms the
half-line (4) into itself.

On the other hand, it is clear from the absolute convergence of both
integrals (2), (3) on the closed half-line (4) that the latter can be replaced
by the half-plane $(\sigma \ge 0, -\infty < t < \infty)$ of a complex variable, $s = \sigma + it$.
In particular, since this half-plane is closed, the Theorem is applicable on
the boundary line, $\sigma = 0$, where $s = it$.

These two obvious extensions of the Theorem contain (and are substan- 
tially equivalent to) the following statement: 


Let A(t), where −∞ < t < ∞, be a matrix function of n by n elements
each of which is representable in the form

\[ a_{jk}(t) = \int_p^{\infty} e^{i\lambda t}\, d\alpha_{jk}(\lambda), \qquad (j, k = 1, \cdots, n), \]

where p is a positive number and each of the n² scalar functions α_{jk}(λ) has
a finite total variation on the half-line p ≤ λ < ∞. Then every vector x(t)
representing a solution of dx/dt = A(t)x is of the form

\[ x(t) = c + \int_p^{\infty} e^{i\lambda t}\, d\gamma(\lambda), \]

where the vector c is an integration constant, which can be assigned
arbitrarily, and each of the n components of the vector γ(λ) = γ_c(λ) has a
finite total variation on the half-line p ≤ λ < ∞.

If all n² functions α_{jk}(λ) are chosen to be step-functions (which can
have clustering points of discontinuity), there results the following corollary:


Let A(t), where −∞ < t < ∞, be a matrix function of n by n elements
each of which is uniformly almost periodic. Suppose that

(i) the frequencies, λ_m, occurring in the Fourier expansion,

\[ A(t) \sim \Bigl( \sum_m a_{jk}^{m} e^{i\lambda_m t} \Bigr), \]

of A(t) have a positive lower bound, (or a negative upper bound), and that

(ii) these n² Fourier expansions are absolutely convergent,

\[ \sum_{j=1}^{n} \sum_{k=1}^{n} \sum_m |a_{jk}^{m}| < \infty. \]

Then every vector x(t) representing a solution of dx/dt = A(t)x

(I) is uniformly almost periodic, with a mean-value,

\[ M(x) = \lim_{v-u \to \infty} (v-u)^{-1} \int_u^v x(t)\, dt, \]

which can be assigned as an arbitrary integration constant, c = M(x),
determining x(t), and

(II) the Fourier expansion of each of the n components of every x(t)
is absolutely convergent.

In addition, every ray containing the frequencies λ_m contains every
non-vanishing frequency of each of the n components of every solution x(t).


The last assertion, in which the “ray” is meant to be generated by 
positive integral coefficients, corresponds to the Remark following the Theorem. 
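
The simplest instance of the corollary can be checked by hand and by
machine. In the scalar case n = 1 with A(t) = a e^{iλt}, λ > 0, the solutions
are x(t) = c exp(a e^{iλt}/(iλ)), whose Fourier expansion is the sum of
c z^k e^{ikλt}/k! over k = 0, 1, 2, ..., with z = a/(iλ). The sketch below
confirms numerically that the mean value is c and that no negative frequency
occurs; the values of a, λ and c are arbitrary assumptions chosen for
illustration.

```python
import cmath
import math

A_COEFF = 0.7     # hypothetical Fourier coefficient a of A(t) = a e^{i lam t}
LAM = 2.0         # hypothetical positive frequency
C = 1.5 - 0.5j    # integration constant, i.e. the prescribed mean value

def x(t):
    # Closed-form solution of dx/dt = a e^{i lam t} x.
    return C * cmath.exp(A_COEFF * cmath.exp(1j * LAM * t) / (1j * LAM))

def fourier_coefficient(k, samples=4096):
    """Mean value of x(t) e^{-i k lam t} over one period 2 pi / lam."""
    period = 2 * math.pi / LAM
    total = 0j
    for j in range(samples):
        t = period * j / samples
        total += x(t) * cmath.exp(-1j * k * LAM * t)
    return total / samples

z = A_COEFF / (1j * LAM)
print(abs(fourier_coefficient(0) - C))            # mean value equals c
print(abs(fourier_coefficient(-1)))               # no frequency on the opposite ray
print(abs(fourier_coefficient(2) - C * z**2 / 2)) # coefficient at frequency 2*lam
```

The frequencies kλ that occur are positive integral multiples of the single
frequency of A(t), in agreement with the last assertion of the corollary.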

It would be interesting to know whether the last italicized theorem 
remains true if its assumption (ii) and its assertion (II) are omitted. 


THE JOHNS HOPKINS UNIVERSITY. 




ON 1-REGULAR CONVERGENCE OF SEQUENCES OF 
2-MANIFOLDS.* 


By GAIL S. YOUNG, JR.


By results of Whyburn’s [5],¹ if a sequence of 2-spheres, in a compact
space, converges 1-regularly,² then the limiting set is either a point or a
2-sphere; and if a sequence of 2-cells converges 1-regularly while the
boundaries of the cells converge 0-regularly, then the limiting set is either
a point or a 2-cell. Generalizations of these have been made by Vaughan [3],
White [4] and Begle [1]. Of particular interest to us here is this result³
proved by Begle: If a sequence of compact orientable 2-manifolds (without
boundary) converges 1-regularly to a non-degenerate set P, then (1) P is
itself a 2-manifold, and (2) P is homeomorphic to all but a finite number
of terms of the sequence.

In this paper I intend to extend the above theorems to the case of
1-regular convergence of 2-manifolds in general. If we try to get Begle’s
conclusion in the non-compact case, however, we find that the usual definition
of 1-regular convergence is stronger than needed for (1) and not powerful
enough for (2). To see this last statement consider the following example:
Let M_0 denote the set in 3-space which is the sum of all the points in the
xy-plane exterior to or on the circles (x ± 2)² + y² = 1 plus all the points
with positive z-coordinates lying on the torus of revolution cutting z = 0
in these two circles. For each natural number n, let M_n denote the image
of M_0 under a translation sending each point (x, y, z) of M_0 into the point
(x + n, y, z). Then, in the usual sense {M_n} converges 1-regularly to z = 0,
but no M_n is homeomorphic to this plane. An example illustrating a slightly
different trouble is furnished by the sequence of half-planes {z = 1/n,


*Received May 10, 1948; presented to the American Mathematical Society, 
December 27, 1946 and September 4, 1948. 

¹ Numbers in brackets refer to the bibliography.

² In a compact space, a sequence {A_n} of closed sets converges r-regularly to a set
A provided that, first, {A_n} converges to A, and, second, given ε > 0, there exist numbers
δ > 0 and N such that for 0 ≤ i ≤ r, and n > N, every i-cycle of A_n of diameter < δ
bounds in A_n on a set of diameter < ε.

³ Theorem 7 of [1]. In Begle’s paper, this appears as an application of more
general results on convergence of sequences of generalized orientable manifolds.

⁴ By a 2-manifold, I mean here a connected metric space, each point of which has a
neighborhood whose closure is homeomorphic to a closed 2-cell. The set of all points
not having neighborhoods homeomorphic to open 2-cells is the boundary of the manifold.




x ≥ −n}, which also converges 1-regularly to z = 0. To remove these
difficulties, I make the following definitions.

We suppose throughout this paper that all sets considered are imbedded 
in a locally compact complete metric space S. We use Vietoris cycles and 
homologies on compact carriers only. 


DEFINITIONS: A sequence of closed sets {A_n} converges W-r-regularly
to a set A provided that (1) lim A_n = A; (2) for each point p of A and
each ε > 0 there exist positive numbers δ and N such that for each two
integers n and k, n > N and 0 ≤ k ≤ r, every k-cycle of A_n in the
δ-neighborhood of p bounds on a subset of A_n in the ε-neighborhood of p.
The sequence converges S-r-regularly to A provided that (1) given ε > 0,
there is an integer N such that, for n > N, each point of A_n is at distance
< ε from A, and each point of A is at distance < ε from A_n; and (2) given
ε > 0, there exist positive numbers δ, N such that, for n > N and 0 ≤ k ≤ r,
every k-cycle of A_n of diameter < δ bounds on a subset of A_n of diameter < ε.

For compact spaces these definitions coincide with r-regularity. We can 
now state the principal results of this note. 


THEOREM 2. Let {M_n} be a sequence of 2-manifolds converging
W-1-regularly to a non-degenerate set M. Suppose that the (combinatorial)
boundaries B_n of the sets M_n are either empty or converge W-0-regularly or
have an empty upper limit. Then M is a 2-manifold.


THEOREM 3. Let {M_n} be a sequence of 2-manifolds converging
S-1-regularly to a 2-manifold M. Suppose that either the boundaries of M and
the sets M_n are empty, or the boundaries B_n of the sets M_n converge
S-0-regularly to the boundary of M. Then M is homeomorphic to all but a finite
number of the sets M_n.

We prove some preliminary results. 

LEMMA 1. Let the closed set B be the limit of a sequence {B_n} of closed
sets converging W-0-regularly to B. Then

(1.1) If {x_n; x_n ∈ B_n} and {y_n; y_n ∈ B_n} converge to the point x of B,
then for almost all n, x_n + y_n is contained in a component of B_n;

(1.2) If for each n C_n is a component of B_n, and lim inf C_n ≠ 0, then
lim inf C_n and lim sup C_n are each closed and open in B;

(1.3) Each component of B is open in B;

(1.4) If the convergence is S-0-regular, then for almost all n there is
a one-to-one correspondence t_n between the components of B and those of B_n
with the property that if C is a component of B, then lim t_n(C) = C.




Proof. Let ε be chosen so small that the 2ε-neighborhood U of x has
a compact closure. For n large, diam (x_n + y_n + x) is so small that every
0-cycle of B_n carried by x_n + y_n bounds in U·B_n; and any irreducible
carrier of this homology is a connected subset of B_n containing x_n + y_n.
This proves (1.1).

If lim inf C_n = H, then H is closed. Let x be a point of H, and choose
any ε > 0; then there exist numbers δ, N, satisfying the conditions in the
definition of W-0-regularity for x and ε. Let y be a point of B at distance
< δ/3 from x, and let {y_n; y_n ∈ B_n} converge to y. Let {x_n; x_n ∈ C_n}
converge to x. There is a number N′ such that, if n > N′, d(y, y_n) and
d(x, x_n) are both less than δ/3. For n > N + N′, any 0-cycle of B_n carried
by x_n + y_n bounds in B_n on a connected subset, of diameter < ε. Hence y_n
also lies in C_n, so that y is in lim inf C_n, proving that H is open. A
similar argument, using a subsequence of {C_n}, proves that lim sup C_n is
open, and so establishes 1.2. I remark in passing that for non-compact sets,
1.2 cannot be strengthened, though for compact sets we can actually prove
that lim inf C_n = lim sup C_n, using 1.3.

Since every component, C, of B intersects lim inf C_n for some sequence
{C_n} of components of {B_n}, it follows from 1.2 that C is in lim inf C_n.
Let x be any point of C and choose ε, δ, N, y, {x_n}, {y_n} and N′ as in the
previous paragraph, ε being so small that the 2ε-neighborhood U of x has a
compact closure. Then if K_n denotes a subset of B_n of diameter < ε,
n > N + N′, on which x_n − y_n irreducibly bounds, K_n is closed and
connected, and so is lim sup K_n, from the compactness of U. Hence y belongs
to C, and 1.3 follows.

If the convergence is S-0-regular, the argument used in the previous
paragraph shows that there is a δ > 0 such that if two points of B are at
distance < δ, then they lie in the same component of B. We may choose δ
so that, similarly, there is an N, the same for all points of B, such that if
x_n and y_n are points of B_n, n > N, which are within δ of a point x of B,
then they lie in the same component of B_n. There is an N′ such that each
point of B_n, n > N′, is within δ/3 of some point of B. For n > N + N′, if C
is a component of B_n, then no point of C is within δ/3 of each of two
components of B, for if two such components existed, C would intersect the
boundary of the δ/3 neighborhood of their sum K, in some point. This is
impossible, however, as then some third component of B has points within
2δ/3 of K. For n > N + N′, then, to each component of B_n make correspond the
unique component of B at distance < δ/3 from it. This satisfies the
conditions of 1.4.




THEOREM 1.⁵ Let {B_n} be a sequence of closed sets converging
W-0-regularly to a closed set B. If each component of each B_n is a simple
closed curve or an open curve, and no component of B is degenerate, then each
component of B is also a simple closed curve or an open curve. If the
convergence is S-0-regular, then B is homeomorphic to almost all B_n.


Proof. If each point of B is of Menger order 2, the first part of the
theorem is true. Suppose, then, that some point x of a component C of B
is of order other than 2. If order x = 1, there is a neighborhood U of x,
with compact closure, such that B(U)·C is a point, B(U) denoting the
set-theoretic boundary of U, and such that (S − U)·C ≠ 0. Then if for each
n C_n is a component of B_n, and lim inf C_n contains C, we can choose two
sequences {x_n; x_n ∈ C_n} and {z_n; z_n ∈ C_n}, the first converging to x,
the second converging to a point z of C not in U. One, two, or possibly three
components of C_n − x_n − z_n meet B(U); in each such component an arc
irreducibly contains its intersection with B(U). Let H_n denote the sum of
these arcs. If a_n and b_n are points of B(U) in different components of H_n,
then no 0-cycle carried by a_n + b_n bounds in C_n − x_n − z_n. Since
B(U)·C_n converges to y, for n large every 0-cycle carried by a_n + b_n does
bound in C_n − x_n − z_n, so that for n large H_n has just one component.
Then H_n clearly separates C_n, one component of C_n − H_n being in U. Since
an arc cannot separate a simple closed curve, C_n is an open curve, which
contradicts the compactness of U.

Suppose that order x > 2. Then x is the junction point of a simple
triod ax + bx + cx, since by Theorem 2.2 of [5] and Lemma 1, each
component of B is locally connected. By Theorem 2.21 of [5], whose proof is
still valid here, there exist sequences {A_1n}, {A_2n} and {A_3n} of arcs of
C_n converging 0-regularly to axb, axc, and bxc, respectively. Since each C_n
is atriodic, these can be chosen to be non-intersecting. In each A_jn, select
a point x_jn such that lim x_jn = x. Then no 0-cycle carried by, say,
x_1n + x_2n, bounds in C_n − B(A_1n + A_2n + A_3n). For n large, this
contradicts W-0-regularity and proves the first part of the theorem.

In view of 1.4 of Lemma 1, all that is needed to finish the proof of 
Theorem 1 is to show that no sequence of simple closed curves can converge 
S-0-regularly to an open curve and that no sequence of open curves can 
converge even W-0-regularly to a simple closed curve. Suppose, then, that 


⁵ A more general theorem concerning 0-regular convergence of certain 1-dimensional
sets could be proved, but there seems no great necessity for it. If in Theorem 1, each
component of each B_n were a simple closed curve, Theorem 1 would follow from Lemma
1 and Theorem 3.2 of [5].




the sequence {J_n} of simple closed curves converges S-0-regularly to the open
curve L. In L there is a countable closed set H having no limit point and
with the property that each point of L is between some two points of H
(which corresponds to the set of all integers in the number line). No infinite
subset of H forms a Cauchy sequence, since S is complete, so that there is
an ε > 0 such that no infinite subset of H is of diameter < 2ε. There is
an N such that each set J_n, n > N, is within ε/2 of L and such that each
point of L is within ε/2 of each such J_n. Let {x_i} be an ordering of H.
For each integer i, let y_i denote a point of J_n, n > N, at distance < ε/2
from x_i. Some point y of J_n is a limit point of Σy_i. Then infinitely many
points x_i are within ε of y and hence form a subset of H of diameter < 2ε,
which is impossible.

From the local compactness of our space, every simple closed curve has
a neighborhood with compact closure. As an open curve is required to be
closed, such a neighborhood cannot contain one, which would dispose of the
second case. However, this argument rather avoids the issue, and it is true
that the assumption of closure is not needed. Suppose, in fact, that a sequence
{O_n} of homeomorphs of the line, not necessarily closed, converged
W-0-regularly to a simple closed curve J = axb + ayb. Examination of the proof
of Theorem 2.21 of [5] shows that in each O_n there exist arcs a_n x_n b_n,
a_n y_n b_n, such that {a_n x_n b_n} and {a_n y_n b_n} converge 0-regularly to
axb and ayb, respectively. For n large, this implies the existence of a simple
closed curve in O_n, which is impossible.

The circles x² + y² − 2ny = 0 converge W-0-regularly to y = 0, showing
that S-0-regularity was needed in Theorem 1.
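
The difference between the two modes of convergence in this example can be
seen numerically. The circle x² + y² − 2ny = 0 has centre (0, n) and radius n.
Near any fixed point of the line y = 0 it lies within x²/(2n) of the line,
while its highest point (0, 2n) is at distance 2n from the line, so that
condition (1) of the definition of S-r-regular convergence fails. The
following is only a sketch of these two distance computations; the function
names are chosen here for illustration.

```python
import math

def dist_point_to_circle(px, py, n):
    # Distance from (px, py) to the circle x^2 + y^2 - 2ny = 0,
    # i.e. the circle of radius n about (0, n).
    return abs(math.hypot(px, py - n) - n)

def farthest_circle_point_from_line(n):
    # The point (0, 2n) of the circle maximises the distance to the line y = 0.
    return 2.0 * n

for n in (10, 100, 1000):
    local = dist_point_to_circle(3.0, 0.0, n)   # shrinks like 9/(2n)
    far = farthest_circle_point_from_line(n)    # grows without bound
    print(n, local, far)
```

The first column of distances tends to zero for every fixed point of the
line, which reflects the local (W) convergence; the second shows that the
circles do not stay uniformly close to the line.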


Proof of Theorem 2. I have proved in [8] that a 2-manifold, N, is
characterized among locally compact, locally connected and connected metric
spaces by these two conditions: (1) N has no local cut points, and (2) if C
is a compact subset of N, there is an ε > 0 such that every simple closed
curve in C of diameter < ε separates N. Hence we need only show that M
satisfies these conditions. However, in the proof of his theorem mentioned
above, Begle has given an argument which needs only slight modifications
to prove that M satisfies (2), so that a proof is needed only for (1).
Condition (1), however, is not trivial, and the reader may perhaps follow the
rest of the argument better if he keeps in mind the following two examples.
The first is formed by the sequence of sets 1 − 1/n ≤ x² + y² ≤ 1 + 1/n,
which converges 1-regularly to the unit circle, though the boundary does not
converge 0-regularly. The second is formed by letting M denote a pinched
torus and {M_n} denote a sequence of tori converging smoothly to M, the




convergence being 0-regular, and failing to be 1-regular only at the local cut
point of M.

Suppose, then, that some point, x, of M were a local cut point of M.
There is a neighborhood U of x with compact closure such that (1) letting
U_n = M_n·U, for almost all n every cycle of M_n that is carried by a subset
of U_n bounds in M_n; (2) M·U is connected, but some two points, a and b,
are separated in M·U by x; (3) for each n, U_n − B_n·U contains two points,
a_n and b_n, for which lim a_n = a, lim b_n = b; (4) a, b, and x are so close
that for n large any 0-cycle of M_n carried by a_n + b_n bounds in U_n, so that
a_n and b_n lie in the same component of U_n; and (5) for no n does U_n contain
all of a component of B_n, nor does it contain points of two components of B_n.
The second part of (5) follows from Lemma 1. The first part of (5) follows
from the fact that if it were not so, we could find a sequence of components
of B_n containing a subsequence converging to x; since a cycle carried by a
component of B_n can bound only on all M_n, if at all, W-1-regularity would
imply lim inf M_n = x, an impossibility. Let V be a neighborhood of x whose
closure lies in U − a − b. Then, for n large, V·M_n separates a_n from b_n
in U_n. Otherwise, for infinitely many n, there is an arc a_n b_n in
U_n − M_n·V, and lim sup a_n b_n is a connected subset of M·(U − V) that
contains a and b but not x. There is no loss in assuming that V is so small
that for some open neighborhood R of V, and for n large, every 0- or 1-cycle
of R·M_n bounds in U_n − a_n − b_n, and every 0-cycle of B_n·R bounds in
B_n·U_n − a_n − b_n. Choose one such value of n. By considering a
triangulation of M_n of small enough norm we can obtain a closed set C_n which
is the sum of a finite number of cells of the triangulation and which lies in
R·M_n and contains V·M_n in its interior; possibly C_n is not connected. There
is no difficulty in assuming that each component of the closure of the mod 2
boundary of C_n (considered as a complex) is a simple closed curve,⁶ though
the mod 2 boundary of C_n may be larger than the set-theoretic boundary if
C_n·B_n ≠ 0. Now I assert that one such simple closed curve separates a_n from
b_n in U_n. For suppose not, and let a_n b_n be an arc in U_n. Let J be the
first component of the closure of the mod 2 boundary of C_n that a_n b_n
intersects. Since any cycle carried by J bounds in U_n − a_n − b_n there is a
connected set D_n in U_n − a_n − b_n, open rel. M_n, whose boundary rel. M_n
is J. Let p and q be the first and last points of a_n b_n·J in order a_n to
b_n in a_n b_n. Then neither a_n p nor a_n q is in D_n. By definition of J,
the component of U_n − J containing a_n p − p cannot lie in C_n. But every
point of J is a limit point of C_n, so


⁶ Cf. Theorem of Roberts and Steenrod [2], for example, which, while proved only
for a manifold without boundary, can clearly be extended.




that D_n must intersect C_n, and, indeed, every point of D_n near enough to J
is in C_n. Hence the points of q b_n near q lie in U_n − C_n, J being in the
closure of the mod 2 boundary of C_n. Now a_n + b_n lies in one component,
K, of U_n − J. Unless J intersects B_n, K has all of J for boundary, and
even if J intersects B_n in more than one point, K has either all of J or an
arc of J for boundary, a statement that requires some argument. First, by
the condition that R is so small that every 0-cycle of R·B_n bounds in B_n·U,
and the fact that U is so small that no component of B_n lies entirely in U,
we can find a unique arc T of B_n·U that irreducibly contains J·B_n.
Exactly one component, C, of U_n − J both lies in U_n and has all of J for
boundary. If L is a component of J − B_n·J then L spans an arc A of T,
and L + A is a simple closed curve separating U_n into exactly two connected
domains, one of which contains C, and the other of which has all of L + A
for boundary, and is the only one of the two that does. Either K is C or is
one of these components of U_n − J whose boundary is such an arc L; in
either case, our remark is proved. Using this result, then, it is possible to
find an arc from a_n p to q b_n in K which is close enough to J not to
intersect C_n, thus constructing an arc a_n b_n not intersecting J at all.
Repetition of this argument gives finally an arc a_n b_n not intersecting C_n
but lying in U_n. Hence there is a simple closed curve, J_n, in the mod 2
boundary of C_n, that separates a_n from b_n in U_n. Even more than this, some
arc A_n of J_n separates a_n from b_n in U_n, for if J_n irreducibly separates
U_n, it separates it into two connected sets, one of which lies in
U_n − a_n − b_n, since any cycle carried by J_n bounds in that set. Of course,
the existence of A_n implies that J_n·B_n ≠ 0, so that, as before, there is an
arc T_n of B_n·U such that A_n + T_n is a simple closed curve which separates
U_n into two connected sets, one containing a_n, the other b_n. It will be
recalled that B_n does not contain a_n or b_n, so that T_n contains neither.
We now have a contradiction, since A_n + T_n cannot bound in U_n − a_n − b_n.
Hence M has no local cut points, which completes the proof of Theorem 2.


Proof of Theorem 3. We shall form a particularly desirable triangulation
of M. There exist numbers η > 0 and N_1 such that, for n > N_1, every 0- or
1-cycle of M_n (B_n) of diameter < 3η bounds on a subset of M_n (B_n) of
diameter < 1/3 diam M_n (B_n). (There are no small non-zero 1-cycles in B_n.)
As may be seen by an argument similar to that of Lemma 1.1 of [5], there
exist numbers γ, γ′ such that every 0-cycle of M (B) of diameter < γ′
bounds on a subset of M (B) of diameter < γ/3, and such that every 1-cycle
of M of diameter < γ bounds on a subset of M of diameter < η. In M there
is a set of points {v_i} such that (1) every point of M is within γ′/2 of some




v_i; (2) every point of B is within γ′/2 of some v_i in B; and (3) no two
points v_i are closer than γ′/4. These will be the vertices of our
triangulation, Δ, with a 1-dimensional skeleton L. If two points v_i, v_j are
in B, with d(v_i, v_j) < γ′, and if no other v_k is on the unique arc v_i v_j
of B with diameter < γ/3, then put v_i v_j in L. If no more than one of v_i,
v_j is in B, we can select an arc v_i v_j of diameter < γ/3, having at most
one end point in common with B. By well known theorems on 2-manifolds, we can
select all these arcs so that each two have no more in common than an end
point. Put all these in L. Now let J = v_1v_2 + v_2v_3 + v_1v_3 be a simple
closed curve in L. Then diam J < γ, so that any 1-cycle carried by J bounds on
a subset of M of diameter < η. Using this, and the definition of L, we see
that J separates M into two connected open sets, one of which, D, has diameter
< η and has all J for boundary. With the help of Theorem 3.1 of my paper [8]
and the properties of η, it follows that D + J is a closed 2-cell. It is then
easy to see that L·(D + J) defines a triangulation of D + J. It follows
readily that L is the 1-dimensional skeleton of a triangulation Δ of M of
norm < η, and that every 1- or 2-cell of Δ has diameter > γ′/4.
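
The choice of the vertex set, every point of M within γ′/2 of a vertex,
every point of B within γ′/2 of a vertex in B, and no two vertices closer
than γ′/4, is the familiar selection of a maximal separated set. The sketch
below illustrates the selection on a finite sample only: a grid in the unit
square stands in for M, its edge points for B, Euclidean distance replaces
the metric of the space, and the value γ′ = 1 is an arbitrary assumption. It
is not the construction of the paper, which works in the manifold itself.

```python
import math

def greedy_net(points, sep):
    """Keep each point whose distance to all points kept so far is >= sep.

    The kept set is maximal, so every input point is within sep of a kept one.
    """
    chosen = []
    for p in points:
        if all(math.dist(p, v) >= sep for v in chosen):
            chosen.append(p)
    return chosen

GAMMA_PRIME = 1.0   # hypothetical value of the constant gamma'

# Finite stand-ins: a 21 x 21 grid for M, its edge points for B.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
boundary = [p for p in grid if p[0] in (0.0, 1.0) or p[1] in (0.0, 1.0)]
interior = [p for p in grid if p not in boundary]

# Boundary points are processed first, so every boundary point ends up
# close to a chosen vertex that itself lies on the boundary.
vertices = greedy_net(boundary + interior, GAMMA_PRIME / 4)
boundary_vertices = [v for v in vertices if v in boundary]
print(len(vertices), len(boundary_vertices))
```

Processing the boundary sample before the interior sample is what secures
condition (2); separation by γ′/4 then gives density γ′/4, which is more than
the γ′/2 required.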

As has been said before, Whyburn has proved in Theorem 2.21
of [5] that every arc A of M is the limit of a sequence of arcs, one from
each M_n, that converges 0-regularly to A. Examination of this proof will
show that, by the aid of uniformity properties of S-1-regular convergence, we
can select for each closed 1-cell, C, of L a sequence {C_n} of arcs converging
0-regularly to C, such that (1) for each n, C_n is in M_n; (2) if C is in B,
the boundary of M, then C_n is in B_n; and (3) given ε > 0, there is an N such
that for each such 1-cell, C, and n > N, each point of C_n is within ε of C,
and each point of C is within ε of C_n. Since each 1-cell of L is of diameter
> γ′/4, it follows from this and (3), that for n greater than some number
N_2, the correspondence between the cells C and the arcs C_n is one-to-one.
Hence, for each fixed n > N_2, the sum of the sets C_n, summed over the 1-cells
of L, is an approximation, in some sense, to L; call this sum L′_n. I will
modify L′_n to make it homeomorphic to L, for n large. First, it is clear
(from Whyburn’s argument, if nothing else) that we can assume that if {C_n}
and {C′_n} are the sequences chosen to converge to 1-cells of L with a common
vertex, then C_n·C′_n contains an end-point of each. Let x be a vertex of L,
and let y_1, ···, y_k be the vertices of L which can be joined to x by a 1-cell
of L. For each j, 1 ≤ j ≤ k, let {x_n y_jn} denote the sequence of arcs chosen
above to converge to x y_j, x_n converging to x. If for each n, d_n(x) denotes
the l.u.b. of distances d(x_n, z), where z is any point common to two arcs
x_n y_jn, then d_n(x) converges to zero with n, uniformly in x, by (3) of our
third sentence. There is an N_3 such that for n > N_3, and for any x,




d_n(x) < 1/4 min d(x_n, y_jn), the minimum being with respect to j. For such
an n, I will modify L′_n at x so that the new arcs x_n y_jn will have only x_n
in common, by the following construction. If x_n y_1n · x_n y_2n ≠ x_n, then
one of these, say x_n y_2n, is not entirely in B_n, since each point of B_n is
of Menger order 2. Using standard accessibility and separation properties of
2-manifolds, we can construct an arc x_n z_2n, with z_2n in x_n y_2n, of
diameter < 1/4 d_n(x), that has just x_n in common with x_n y_1n + B_n, and
such that y_2n z_2n + z_2n x_n is an arc. Suppose now that for
1 ≤ i ≤ j − 1 < k, arcs x_n z_in have been constructed so that
diam x_n z_in < 3/2 d_n(x), no two have more than x_n in common, and so that
no one intersects B_n unless it lies therein. Then if x_n y_jn is not in B_n,
we can construct an arc z_jn x_n from a point z_jn in x_n y_jn to x_n, of
diameter < 3/2 d_n(x) and having only x_n in common with

\[ \sum_{i=1}^{j-1} x_n z_{in} y_{in} + B_n. \]

(Roughly, what we do is start from z_jn, follow x_n y_jn along until we meet
either the first modified arc or B_n, then run along next to that arc to x_n.)
Of course, if x_n y_jn is in B_n, nothing need be done to it for the induction.
Proceeding in this manner, we can therefore modify L′_n at x_n so that L′_n as
thus modified is locally homeomorphic to L at x. But by the uniformity of
S-1-regularity, this process could be carried out simultaneously at every
vertex of L′_n, yielding, for each n sufficiently large, 1-complexes L_n which
are homeomorphic to L, under a homeomorphism sending each vertex of L_n into
the nearest vertex of L. The sequence {L_n} can easily be shown to converge
S-0-regularly to L.

By the definition of η, for n > N_1, if x_1x_2, x_2x_3, and x_1x_3 are the
(closed) 1-cells of the boundary of a 2-simplex of the triangulation Δ, then
the subset T = x_1n x_2n + x_2n x_3n + x_1n x_3n of L_n bounds an open set D in
M_n, with T + D compact. The set T + D cannot contain a component of B_n, for
such a component can bound only on all of M_n, if then, and this contradicts
the definition of η, for n large enough. But, this being so, it is clear that
T + D has no local cut points. Further, every simple closed curve in T + D
distinct from T separates T + D. To see this, let J be such a simple closed
curve and let x be a point of T − T·J. There is an arc xy in
(M_n − T − D) + x from x to a point at least 1/2 diam M_n from J. This arc
does not intersect J, and since J bounds an open set E of diameter
< 1/2 diam M_n, E·(T + D) and (M_n − E)·(T + D) both exist. Hence by
Theorem 3.1 of my paper [8], T + D is a 2-cell.

Now clearly the homeomorphism between L and L_n can be extended to
one between M and a subset K of M_n. We need only prove that every point
of M_n is in K. By Theorem 1, and the definition of L, every point of B_n
is in L_n. Then K is an open subset of M_n, since the star of every point of K




in a 0- or 1-cell of K − B_n is homeomorphic to an open subset of M − B.
On the other hand, the diameters of the closed 2-cells forming K are bounded
away from 0, by the definition of Δ, so that K is closed. Hence K is M_n.

I will point out, in passing, that just as Theorem 1 gives us information 
about 0-regular convergence of sets whose components are 1-manifolds, so we 
can prove a theorem on 1-regular convergence of sets whose components are 
2-manifolds. The nature of such a theorem is so obvious that I will not 
state it. 

Whyburn has proved that a sequence of 2-spheres (or 2-cells) converging 
0-regularly converges to a cactoid (or hemi-cactoid), in other words to a set 
that can be obtained from the original sets by a monotone transformation. 
This is not a property of 2-manifolds in general. In other words, it is false 
that a sequence of compact 2-manifolds, converging 0-regularly, converges 
to a set that can be obtained by a monotone transformation from any
compact 2-manifold whatever. Consider, for example, the set M which is the
sum of the intervals (0,1) of the x- and y-axes, and of each of the intervals
joining (1/n, 0) to (0, 1/n). For almost all null sequences {ε_n}, the set M_n
of all points ε_n away from M is a 2-manifold, and the sequence {M_n} converges
0-regularly to M. However, M has infinite coherence, whereas a monotone
transformation can only lower coherence. It seems probable that a sequence
{M_n} of homeomorphic manifolds converging 0-regularly converges to a
monotone image of its elements, but I have not tried to prove this. For
further discussion of the 0-regular convergence problem, see Whyburn’s paper
[6].


UNIVERSITY OF MICHIGAN. 


BIBLIOGRAPHY. 


1. E. G. Begle, “ Regular convergence,” Duke Mathematical Journal, vol. 11 (1944), 
pp. 441-450. 

2. J. H. Roberts and N. E. Steenrod, “ Monotone transformations of two-dimensional 
manifolds,” Annals of Mathematics, vol. 39 (1938), pp. 851-862. 

3. H. E. Vaughan, Abstract no. 217, Bulletin of the American Mathematical Society, 
vol. 42 (1936), p. 337. 

4. P. A. White, “On r-regular convergence,” Bulletin of the American Mathematical
Society, vol. 50 (1944), pp. 123-128.

5. G. T. Whyburn, “On sequences and limiting sets,” Fundamenta Mathematicae, 
vol. 25 (1935), pp. 408-426. 

6. G. T. Whyburn, “Regular convergence and monotone transformations,” American
Journal of Mathematics, vol. 57 (1935), pp. 902-906.

7. G. T. Whyburn, Analytic Topology, American Mathematical Society Colloquium
Publications, vol. 28 (1942).

8. G. S. Young, “A characterization of 2-manifolds,” Duke Mathematical Journal,
vol. 14 (1947), pp. 979-990.




ON THE APPROXIMATION OF IRRATIONAL NUMBERS BY THE 
CONVERGENTS OF THEIR CONTINUED FRACTIONS.* 


By ALFRED BRAUER and NATHANIEL MACON. 


Introduction. Let ξ be any positive irrational number. We shall
consider the expansion of ξ as the regular continued fraction

\[ \xi = [q_0, q_1, q_2, \cdots]. \]

Let us denote the numerators and denominators of its convergents by A_n
and B_n respectively (n = 0, 1, 2, ···), and set

\[ |\xi - A_n/B_n| = 1/(\lambda_n B_n^2). \tag{1} \]


Hurwitz [10] proved that for any given ξ there exist infinitely many
λ_n > √5, and that there exist numbers ξ for which only a finite number of
the λ_n are greater than c for any c > √5. Vahlen [16] showed that for
every ξ at least one of two consecutive λ_n exceeds 2, and Borel [1] proved
that

\[ \max(\lambda_n, \lambda_{n+1}, \lambda_{n+2}) > \sqrt{5} \]

for every n. These results were improved for special cases in which conditions
were imposed upon the partial quotients q_n.
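
The theorems of Vahlen and Borel are easy to test numerically from the
definition (1), read as |ξ − A_n/B_n| = 1/(λ_n B_n²). The sketch below
computes the first few λ_n for three quadratic irrationals with high
precision decimal arithmetic and checks both statements on them. The helper
names, the working precision and the sample numbers are assumptions made
here for illustration; a finite check of this kind is of course no proof.

```python
from decimal import Decimal, getcontext

getcontext().prec = 200   # generous working precision (arbitrary choice)

def lambdas(xi, count):
    """Return lambda_0, ..., lambda_{count-1} for the irrational number xi."""
    a_prev, a_cur = 0, 1          # A_{-2}, A_{-1}
    b_prev, b_cur = 1, 0          # B_{-2}, B_{-1}
    x = xi                        # current complete quotient
    out = []
    for _ in range(count):
        q = int(x)                # partial quotient (x is positive)
        a_prev, a_cur = a_cur, q * a_cur + a_prev
        b_prev, b_cur = b_cur, q * b_cur + b_prev
        err = abs(xi - Decimal(a_cur) / Decimal(b_cur))
        out.append(1 / (err * b_cur * b_cur))
        x = 1 / (x - q)           # next complete quotient
    return out

ROOT5 = Decimal(5).sqrt()
SAMPLES = {
    "golden": (1 + ROOT5) / 2,
    "sqrt2": Decimal(2).sqrt(),
    "sqrt7": Decimal(7).sqrt(),
}

def vahlen_holds(lams):
    # At least one of any two consecutive lambda_n exceeds 2.
    return all(max(lams[i], lams[i + 1]) > 2 for i in range(len(lams) - 1))

def borel_holds(lams):
    # At least one of any three consecutive lambda_n exceeds sqrt(5).
    return all(max(lams[i:i + 3]) > ROOT5 for i in range(len(lams) - 2))

for name, xi in SAMPLES.items():
    lams = lambdas(xi, 30)
    print(name, vahlen_holds(lams), borel_holds(lams))
```

For the golden ratio the λ_n oscillate about √5 and tend to it, which
illustrates why the constant in Hurwitz's theorem cannot be improved.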
Let us assume that q_{n+2} ≥ 2. It was shown by Humbert [9] that

\[ \max(\lambda_n, \lambda_{n+1}, \lambda_{n+2}) > \sqrt{8}, \qquad (n = 0, 1, 2, \cdots), \]

and by Fujiwara [3] that either λ_{n+1} > 2.5 or

\[ \min(\lambda_n, \lambda_{n+2}) > 2.5. \]


Moreover, the following theorems were obtained by Fujiwara [3, 4]: 

If = 2 and = 1, then at least one of the five numbers λ_n, λ_{n+1},
λ_{n+2}, λ_{n+3}, λ_{n+4} is greater than (2 + 5√10)/6.

If ξ is equivalent to neither ½(1 + √5) nor 1 + √2, if q_{n+2} = 2 and
q_{n+3} = 1, and all the q_i are less than 3, then

\[ \max(\lambda_n, \lambda_{n+1}, \lambda_{n+2}, \lambda_{n+3}, \lambda_{n+4}) > \sqrt{221}/5, \]
except for 


Qn+3> | [2, 1,1, 2, 2, 1, 1, 2]. 


* Received May 27, 1948. Presented to the American Mathematical Society, 
December 28, 1948. 




Other proofs of these results were given by Perron [13], Hardy and
Wright [7], L. R. Ford [2], Kurosu [12], Fukasawa (Morimoto) [5, 6],
and Shibata [15]. For a more detailed history of the problem see Koksma
[11].

In this paper, we shall prove the following extensions of Borel’s theorem: 

For every irrational number $\xi$, either at least two of each set of five consecutive $\lambda_n$ exceed $\sqrt{5}$, or at least one of them exceeds 3.

Let $t$ be any number of form $3m + 2$. In each sequence of $t$ consecutive $\lambda_n$, there are either at least $m + 1$ elements which exceed $\sqrt{5}$ or $m$ elements which exceed 3.

These results will be improved further for the case in which all of the partial quotients are less than 3. Moreover, we shall consider some further relations between consecutive numbers $\lambda_n$. In particular, we shall prove that for every $n$

$\min(\lambda_n, \lambda_{n+2}) > \lambda_{n+1}/(\lambda_{n+1} - 1)$,
and
$\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} > \max\{\lambda_n^2/(\lambda_n - 1),\ \lambda_{n+1}^2/(\lambda_{n+1} - 1)\} > 4$.


Heawood [8] and Perron [14] investigated those numbers $\xi$ for which the upper limit of the $\lambda_n$ is less than 3. It follows here that for each such number, only a finite number of $\lambda_n$ may be less than 3/2.

Finally, estimates for the sums of three, four, and eight consecutive $\lambda_n$ are obtained. It follows from these results that

$\liminf_{m\to\infty} \bigl(\sum_{\nu=0}^{m-1} \lambda_\nu\bigr)/m \ge (31 + 15\sqrt{5})/32 > 2.0169$.


1. The theorems of Vahlen and Borel and some consequences. We will 
need the following well-known facts about continued fractions which may 
be found in Hardy and Wright [7], p. 150 ff. or Perron [13], p. 48 ff. 

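Let us denote the $m$-th complete quotient, $[q_m, q_{m+1}, \cdots]$, of the continued fraction $\xi$ by $\xi_m$ and set

(2) $\phi_1 = 0$, and $\phi_m = B_{m-2}/B_{m-1}$, $(m = 2, 3, \cdots)$.

Then we have

$|\xi - A_m/B_m| = 1/B_m^2(\xi_{m+1} + \phi_{m+1}) = 1/B_m^2(\xi_{m+2}^{-1} + \phi_{m+2}^{-1})$.

Hence

(3) $\lambda_m = \xi_{m+1} + \phi_{m+1} = \xi_{m+2}^{-1} + \phi_{m+2}^{-1}$.

Moreover, we have

(4) $|\xi - A_m/B_m| + |\xi - A_{m+1}/B_{m+1}| = 1/B_m B_{m+1}$.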




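Two consecutive values $\phi_m$ and $\phi_{m+1}$ satisfy the condition

(5) $\min(\phi_m, \phi_{m+1}) < \tfrac{1}{2}(\sqrt{5} - 1)$.

Finally, if we denote the partial quotients by $q_m$ $(m = 0, 1, 2, \cdots)$, then

(6) $\xi_m = q_m + \xi_{m+1}^{-1}$, $(m = 0, 1, 2, \cdots)$,

(7) $\phi_{m+1} = (q_m + \phi_m)^{-1}$, $(m = 1, 2, \cdots)$.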


Let us now prove the following lemma, which is due in part to 
Fujiwara [3]: 


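LEMMA. For all $n$, we have

(8) $\phi_{n+2} = \tfrac{1}{2}\lambda_{n+1}\{1 - (1 - 4/\lambda_n\lambda_{n+1})^{1/2}\}$,

(9) $\xi_{n+2} = \tfrac{1}{2}\lambda_{n+1}\{1 + (1 - 4/\lambda_n\lambda_{n+1})^{1/2}\}$.

Proof. It follows from (1) and (4) that

$1/\lambda_n B_n^2 + 1/\lambda_{n+1} B_{n+1}^2 = |\xi - A_n/B_n| + |\xi - A_{n+1}/B_{n+1}| = 1/B_n B_{n+1}$,

and, multiplying by $\lambda_{n+1} B_n^2$ and applying (2), we have

(10) $\phi_{n+2}^2 - \lambda_{n+1}\phi_{n+2} + \lambda_{n+1}/\lambda_n = 0$.

Moreover, by (3),

$\lambda_{n+1} = \xi_{n+2} + \phi_{n+2}$.

Hence,

(11) $\xi_{n+2}\phi_{n+2} = \lambda_{n+1}/\lambda_n$.

If we consider (10) as a quadratic equation for $\phi_{n+2}$, it follows from (3) and (11) that $\xi_{n+2}$ and $\phi_{n+2}$ are the roots; and since $\xi_{n+2} > 1 > \phi_{n+2}$, the lemma is proved.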
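
The content of the lemma is that $\xi_{n+2}$ and $\phi_{n+2}$ are the larger and the smaller root of $z^2 - \lambda_{n+1}z + \lambda_{n+1}/\lambda_n = 0$. The following Python sketch checks this numerically on a finite continued fraction. It assumes the readings $\lambda_m = \xi_{m+1} + \phi_{m+1}$ for (3) and $\phi_{m+1} = 1/(q_m + \phi_m)$ for (7); the helper names and the sample partial quotients are hypothetical.

```python
from math import sqrt, isclose

def xi(q, m):
    """Complete quotient xi_m = [q[m], q[m+1], ...] of the finite list q."""
    x = float(q[-1])
    for qn in reversed(q[m:-1]):
        x = qn + 1.0 / x
    return x

def phi(q, m):
    """phi_m = B_{m-2}/B_{m-1}, via phi_1 = 0 and phi_{k+1} = 1/(q_k + phi_k)."""
    p = 0.0
    for k in range(1, m):
        p = 1.0 / (q[k] + p)
    return p

def lam(q, m):
    """lambda_m = xi_{m+1} + phi_{m+1}."""
    return xi(q, m + 1) + phi(q, m + 1)

def roots_of_10(q, n):
    """Roots of z^2 - lambda_{n+1} z + lambda_{n+1}/lambda_n = 0, smaller first."""
    l0, l1 = lam(q, n), lam(q, n + 1)
    d = sqrt(1.0 - 4.0 / (l0 * l1))
    return 0.5 * l1 * (1.0 - d), 0.5 * l1 * (1.0 + d)

q = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]  # sample partial quotients
checks = []
for n in range(10):
    small, large = roots_of_10(q, n)
    checks.append(isclose(small, phi(q, n + 2), rel_tol=1e-7)
                  and isclose(large, xi(q, n + 2), rel_tol=1e-7))
print(all(checks))
```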

Fujiwara obtained from (8) the following proof of Vahlen's Theorem:
since $\phi_{n+2}$ is real, we have

$\max(\lambda_n, \lambda_{n+1}) > 2$.


Now we shall prove the following somewhat modified version of Borel's Theorem:

THEOREM 1. If

(12) $\phi_{n+2} < \tfrac{1}{2}(\sqrt{5} - 1)$

and $\lambda_n \le \sqrt{5}$, then

$\lambda_n + \lambda_{n+1} > 2\sqrt{5}$,

that is, under the assumption (12), either $\lambda_n > \sqrt{5}$ or $\lambda_{n+1} > \sqrt{5}$.
It is seen from (5) that Theorem 1 contains Borel's Theorem.


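Proof. The function $x + x^{-1}$ is decreasing for $x \le \tfrac{1}{2}(\sqrt{5} - 1)$. Thus it follows from (3) and (12) that

(13) $\lambda_n + \lambda_{n+1} = \xi_{n+2} + \xi_{n+2}^{-1} + \phi_{n+2} + \phi_{n+2}^{-1} > \xi_{n+2} + \xi_{n+2}^{-1} + \sqrt{5}$.

Since $\lambda_n \le \sqrt{5}$, we have

$\xi_{n+2}^{-1} + \phi_{n+2}^{-1} \le \sqrt{5}$,

$\xi_{n+2}^{-1} \le \sqrt{5} - \phi_{n+2}^{-1} < \sqrt{5} - \tfrac{1}{2}(\sqrt{5} + 1) = \tfrac{1}{2}(\sqrt{5} - 1)$,

so that $\xi_{n+2} + \xi_{n+2}^{-1} > \sqrt{5}$. Hence, from (13),

$\lambda_n + \lambda_{n+1} > 2\sqrt{5}$.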


THEOREM 2. If $\phi_{n+2} \ge \tfrac{1}{2}$, then $q_{n+1} = 1$.

Proof. The theorem follows at once from (7), since $q_{n+1} + \phi_{n+1} \le 2$.

The following extension of Vahlen's theorem is now an immediate consequence:

THEOREM 3. If every partial quotient is greater than 1, then

(14) $\max(\lambda_n, \lambda_{n+1}) > \sqrt{5}$ $(n = 0, 1, \cdots)$.

Proof. By Theorem 2, we have $\phi_{n+2} < \tfrac{1}{2} < \tfrac{1}{2}(\sqrt{5} - 1)$ for every $n$; and so (14) follows from Theorem 1.


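2. The sum and the product of two consecutive $\lambda_n$. It follows from (8) that $\lambda_n\lambda_{n+1} > 4$. More exactly, we have

THEOREM 4. The following inequalities hold for every $n$:

$\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} > \max\{\lambda_n^2/(\lambda_n - 1),\ \lambda_{n+1}^2/(\lambda_{n+1} - 1)\} > 4$.

Proof. Two cases arise.

Case I. $\lambda_{n+1} > 2$.

Since $\phi_{n+2} < 1$, it follows from (8) that

$\lambda_{n+1}(1 - 4/\lambda_n\lambda_{n+1})^{1/2} > \lambda_{n+1} - 2$,

and so

$\lambda_{n+1}^2 - 4\lambda_{n+1}/\lambda_n > \lambda_{n+1}^2 - 4\lambda_{n+1} + 4$,

(15) $\lambda_n + \lambda_{n+1} < \lambda_n\lambda_{n+1}$.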


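Hence,

(16) $\lambda_n > \lambda_{n+1}/(\lambda_{n+1} - 1)$ and $\lambda_{n+1} > \lambda_n/(\lambda_n - 1)$.

Moreover, we obtain from (16)

(17) $\lambda_n + \lambda_{n+1} > \lambda_{n+1}/(\lambda_{n+1} - 1) + \lambda_{n+1} = \lambda_{n+1}^2/(\lambda_{n+1} - 1) > 4$,

since the function $x^2/(x - 1)$ assumes its minimum for $x = 2$, and, by (1), $\lambda_{n+1}$ is irrational. Similarly,

(18) $\lambda_n + \lambda_{n+1} > \lambda_n + \lambda_n/(\lambda_n - 1) = \lambda_n^2/(\lambda_n - 1) > 4$.

Thus for Case I, the theorem follows from (15), (17), and (18).

Case II. $\lambda_{n+1} < 2$.

Since $\xi_{n+2} > 1$, we have, from (9),

$\lambda_{n+1}(1 - 4/\lambda_n\lambda_{n+1})^{1/2} > 2 - \lambda_{n+1}$,

and the proof is exactly as in Case I.

From (16) we obtain at once

THEOREM 5. For all $n$, we have

$\min(\lambda_n, \lambda_{n+2}) > \lambda_{n+1}/(\lambda_{n+1} - 1)$.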


COROLLARY 1. If only a finite number of the $\lambda_n$ exceed 3, then only a finite number of them are less than 3/2.

COROLLARY 2. If $\lambda_n < \sqrt{5}$ and $\lambda_{n+1} < \sqrt{5}$, then

$\min(\lambda_n, \lambda_{n+1}) > (5 + \sqrt{5})/4$.

Proof. This follows at once from the fact that $\lambda_n/(\lambda_n - 1)$ and $\lambda_{n+1}/(\lambda_{n+1} - 1)$ are monotonic decreasing.
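
The inequalities of Theorems 4 and 5 can be checked numerically. The sketch below is an illustration only: it assumes the reading $\lambda_m = \xi_{m+1} + \phi_{m+1}$ of (3), uses a long finite list of partial quotients as a stand-in for an irrational number, and the sample quotients and function names are hypothetical.

```python
def lambdas_from_quotients(q, count):
    """lambda_m = xi_{m+1} + phi_{m+1} for m = 0..count-1 (finite list q as a stand-in)."""
    # Complete quotients xi_m, built backwards from the tail.
    xi = [0.0] * len(q)
    xi[-1] = float(q[-1])
    for m in range(len(q) - 2, -1, -1):
        xi[m] = q[m] + 1.0 / xi[m + 1]
    # phi[m] holds phi_m; phi[0] is unused, phi_1 = 0, phi_{m+1} = 1/(q_m + phi_m).
    phi = [0.0, 0.0]
    for m in range(1, count + 1):
        phi.append(1.0 / (q[m] + phi[m]))
    return [xi[m + 1] + phi[m + 1] for m in range(count)]

def g(x):
    """The function x^2/(x - 1) appearing in Theorem 4."""
    return x * x / (x - 1.0)

q = [2, 3] + [1, 2, 1, 1, 3] * 12   # sample partial quotients (hypothetical)
lam = lambdas_from_quotients(q, 22)

theorem4 = all(
    lam[n] * lam[n + 1] > lam[n] + lam[n + 1] > max(g(lam[n]), g(lam[n + 1])) > 4.0
    for n in range(20)
)
theorem5 = all(
    min(lam[n], lam[n + 2]) > lam[n + 1] / (lam[n + 1] - 1.0)
    for n in range(20)
)
print(theorem4, theorem5)
```

The sample starts with $q_1 = 3$ on purpose: for $q_1 = 1$ one has $B_0 = B_1$, so that $\phi_2 = 1$ and the first inequality degenerates to an equality at $n = 0$.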
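THEOREM 6. If $\phi_{n+2} < \tfrac{1}{2}(\sqrt{5} - 1)$, then $\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} > 2 + \sqrt{5}$.

Proof. On putting $m = n$ and $m = n + 1$ in (3), we have, from Theorem 4,

$\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} = \xi_{n+2}^{-1} + \phi_{n+2}^{-1} + \xi_{n+2} + \phi_{n+2}$.

Since $\xi_{n+2} + \xi_{n+2}^{-1} > 2$ and $\phi_{n+2} + \phi_{n+2}^{-1} > \sqrt{5}$, we have

$\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} > 2 + \sqrt{5}$.

COROLLARY. For every $n$, we have either

$\lambda_{n+1}\lambda_{n+2} > \lambda_{n+1} + \lambda_{n+2} > 2 + \sqrt{5}$,

or both

$\lambda_n\lambda_{n+1} > \lambda_n + \lambda_{n+1} > 2 + \sqrt{5}$ and $\lambda_{n+2}\lambda_{n+3} > \lambda_{n+2} + \lambda_{n+3} > 2 + \sqrt{5}$.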




For two important special cases, we have the following improvement of 
Theorem 6: 


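THEOREM 7. If $\phi_{n+2} < \tfrac{1}{2}(\sqrt{5} - 1)$ and $\lambda_{n+1} \ge \sqrt{5}$, then

(19) $\lambda_n + \lambda_{n+1} > 2\sqrt{5}$,

and

(20) $\lambda_n\lambda_{n+1} > 5$.

Proof. If $\lambda_n \le \sqrt{5}$, then (19) follows from Theorem 1. For $\lambda_n > \sqrt{5}$, (19) is obvious.

In order to prove (20), we note from (8) that

$\phi_{n+2} = \tfrac{1}{2}\lambda_{n+1}\{1 - (1 - 4/\lambda_n\lambda_{n+1})^{1/2}\} < \tfrac{1}{2}(\sqrt{5} - 1)$,

and so

$\lambda_{n+1} - \sqrt{5} + 1 < \lambda_{n+1}(1 - 4/\lambda_n\lambda_{n+1})^{1/2}$.

The left member is positive, and so

$\lambda_{n+1}^2 + 6 - 2\sqrt{5} - 2(\sqrt{5} - 1)\lambda_{n+1} < \lambda_{n+1}^2 - 4\lambda_{n+1}/\lambda_n$,

$\lambda_n\{\lambda_{n+1}(\sqrt{5} - 1) - (3 - \sqrt{5})\} > 2\lambda_{n+1}$.

Since the coefficient of $\lambda_n$ is positive, we have

$\lambda_n > 2\lambda_{n+1}/\{(\sqrt{5} - 1)\lambda_{n+1} - 3 + \sqrt{5}\}$.

Hence

(21) $\lambda_n\lambda_{n+1} > 2\lambda_{n+1}^2/\{(\sqrt{5} - 1)\lambda_{n+1} - 3 + \sqrt{5}\}$.

The right member is differentiable for

$\lambda_{n+1} \ne (3 - \sqrt{5})/(\sqrt{5} - 1) = \tfrac{1}{2}(\sqrt{5} - 1)$

and has extrema at $\lambda_{n+1} = 0$ and $\lambda_{n+1} = \sqrt{5} - 1$. Thus it increases for $\lambda_{n+1} \ge \sqrt{5}$; and, from (21), we obtain $\lambda_n\lambda_{n+1} > 5$.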
CorotuaRy. Jf V5 and AS V5, then > 5. 


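THEOREM 8. If $\phi_{n+2} < \tfrac{1}{2}(\sqrt{5} - 1)$ and $\phi_{n+3} < \tfrac{1}{2}(\sqrt{5} - 1)$, then

(22) $\lambda_n + \lambda_{n+1} > 2.5 + \sqrt{5}$

and

(23) $\lambda_n\lambda_{n+1} > (11 + 5\sqrt{5})/4$.

Proof. We have, from (7),

$q_{n+2} + \phi_{n+2} > \tfrac{1}{2}(\sqrt{5} + 1)$,

and so

$q_{n+2} > \tfrac{1}{2}(\sqrt{5} + 1) - \tfrac{1}{2}(\sqrt{5} - 1) = 1$.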




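Hence, by (6),

(24) $\xi_{n+2} > 2$.

It follows, then, from (3) and (24) that

$\lambda_n + \lambda_{n+1} = \xi_{n+2} + \xi_{n+2}^{-1} + \phi_{n+2} + \phi_{n+2}^{-1} > 2.5 + \phi_{n+2} + \phi_{n+2}^{-1} > 2.5 + \sqrt{5}$.

Similarly, we have

$\lambda_n\lambda_{n+1} = 2 + \phi_{n+2}/\xi_{n+2} + \xi_{n+2}/\phi_{n+2}$.

But, by (24),

$\xi_{n+2}/\phi_{n+2} > 4/(\sqrt{5} - 1) = \sqrt{5} + 1$,

so that

$\lambda_n\lambda_{n+1} > 2 + \sqrt{5} + 1 + (\sqrt{5} + 1)^{-1} = (11 + 5\sqrt{5})/4$.

Remark. It follows from (24) and Theorem 2 that the assumptions of Theorem 8 are possible only if $\phi_{n+3} < \tfrac{1}{2}$.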


3. The main theorems. We are now able to prove 


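THEOREM 9. In each sequence of five consecutive terms, $\lambda_{n-2}, \lambda_{n-1}, \lambda_n, \lambda_{n+1}, \lambda_{n+2}$, either at least two terms exceed $\sqrt{5}$ or $\lambda_n$ is greater than 3.

Proof. If only one of the five numbers is greater than $\sqrt{5}$, then, by Borel's Theorem, it must be $\lambda_n$; whence it follows from Theorem 1 that

(25) $\min(\phi_n, \phi_{n+3}) > \tfrac{1}{2}(\sqrt{5} - 1)$.

Then, by (5),

(26) $\max(\phi_{n-1}, \phi_{n+1}, \phi_{n+2}, \phi_{n+4}) < \tfrac{1}{2}(\sqrt{5} - 1)$,

thus, by (24),

(27) $q_{n+1} \ge 2$;

and, by Theorem 2,

(28) $q_{n-1} = q_{n+2} = 1$.

Moreover, by (3),

$\lambda_{n-1} = \xi_n + \phi_n < \sqrt{5}$,

and, by (25),

$\xi_n < \sqrt{5} - \tfrac{1}{2}(\sqrt{5} - 1) = \tfrac{1}{2}(\sqrt{5} + 1)$.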




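Hence we have, from (6),

(29) $q_n = 1$.

By repeated applications of (6) and (7), we may write $\xi_{n+1}$ and $\phi_{n+1}$ as continued fractions and obtain, by (27), (28), and (29),

(30) $\lambda_n = \xi_{n+1} + \phi_{n+1} = [q_{n+1}, q_{n+2}, \xi_{n+3}] + [0, q_n, q_{n-1} + \phi_{n-1}] > [2, 1, 1] + [0, 1, 1] = 3$.
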
In exactly the same way, we obtain the following result, of which 


Theorem 9 is a special case: 


THEOREM 10. Let $t$ be any integer of form $3m + 2$. Each sequence of $t$ consecutive $\lambda_n$ contains either at least $m + 1$ terms which exceed $\sqrt{5}$, or at least $m$ terms which are greater than 3.

We shall improve Theorem 10 for a particularly interesting special case, namely that in which all the partial quotients $q_n$ are less than 3. We understand by $[(\alpha, \beta, \gamma)_p]$ the finite periodic continued fraction $[\alpha, \beta, \gamma, \alpha, \beta, \gamma, \cdots, \alpha, \beta, \gamma]$, where the period occurs $p$ times.


THEOREM 11. Let $\xi$ be an irrational number having only 1's and 2's in its expansion as a continued fraction and $t$ be any integer of form $3m + 2$. In each sequence of $t$ consecutive terms, $\lambda_{n-2}, \lambda_{n-1}, \cdots, \lambda_{n+3m-1}$, either there are at least $m + 1$ terms which exceed $\sqrt{5}$, or the following relations hold for $\mu = 0, 1, 2, \cdots, m - 1$:

if $m$ and $\mu$ are both even, then

$\lambda_{n+3\mu} > [(2, 1, 1)_{m-1-\mu}, 2, 1, 3] + [0, (1, 1, 2)_\mu, 1, 4/3]$;

if $m$ is even and $\mu$ is odd, then

$\lambda_{n+3\mu} > [(2, 1, 1)_{m-\mu}] + [0, (1, 1, 2)_\mu, 1, \tfrac{1}{2}(\sqrt{5} + 1)]$;

if $m$ is odd and $\mu$ is even, then

$\lambda_{n+3\mu} > [(2, 1, 1)_{m-\mu}] + [0, (1, 1, 2)_\mu, 1, 4/3]$;

and finally, if $m$ and $\mu$ are both odd, then

$\lambda_{n+3\mu} > [(2, 1, 1)_{m-1-\mu}, 2, 1, 3] + [0, (1, 1, 2)_\mu, 1, \tfrac{1}{2}(\sqrt{5} + 1)]$.


Proof. If only $m$ of the terms exceed $\sqrt{5}$, then, as in the proof of Theorem 9, we have

$q_{n+1+3\mu} = 2$, $(\mu = 0, 1, \cdots, m - 1)$,

$q_{n+3\mu} = 1$, $(\mu = 0, 1, \cdots, m - 1)$,

and

$q_{n-1+3\mu} = 1$, $(\mu = 0, 1, \cdots, m)$.

As in (30), we obtain

$\xi_{n+1} = [2, 1, \xi_{n+3}] = [2, 1, 1, 2, 1, \xi_{n+6}] = \cdots = [(2, 1, 1)_{m-1}, 2, 1, \xi_{n+3m}]$.

Since $3 > \xi_{n+3m} > 1$, by (6), we have

$\xi_{n+1} > [(2, 1, 1)_{m-1}, 2, 1, 1] = [(2, 1, 1)_m]$, if $m$ is odd;

and

$\xi_{n+1} > [(2, 1, 1)_{m-1}, 2, 1, 3]$, if $m$ is even.

More generally, we obtain for $\mu = 0, 1, \cdots, m - 1$,

(31) $\xi_{n+1+3\mu} > [(2, 1, 1)_{m-\mu}]$ if $m - \mu$ is odd;

and

(32) $\xi_{n+1+3\mu} > [(2, 1, 1)_{m-\mu-1}, 2, 1, 3]$ if $m - \mu$ is even.

Similarly, we have for $\mu = 0, 1, \cdots, m - 1$

(33) $\phi_{n+1+3\mu} = [0, (1, 1, 2)_\mu, 1, q_{n-1} + \phi_{n-1}] > [0, (1, 1, 2)_\mu, 1, 4/3]$ if $\mu$ is even;

since $1/3 < \phi_{n-1} < \tfrac{1}{2}(\sqrt{5} - 1)$, by (7) and (26). Finally

(34) $\phi_{n+1+3\mu} > [0, (1, 1, 2)_\mu, 1, \tfrac{1}{2}(\sqrt{5} + 1)]$ if $\mu$ is odd.

The theorem follows at once from (31), (32), (33), and (34).


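4. The sum of more than two consecutive $\lambda_n$.

THEOREM 12. For every $n$, we have

$\lambda_n + \lambda_{n+1} + \lambda_{n+2} > \tfrac{1}{2}(5 + 3\sqrt{5})$.

Proof. By Borel's Theorem, at least one of the three numbers $\lambda_n, \lambda_{n+1}, \lambda_{n+2}$ is greater than $\sqrt{5}$. First we assume that $\lambda_n > \sqrt{5}$. It follows from Theorem 4 that $\lambda_{n+1} + \lambda_{n+2} > 4$. Hence

(35) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} > 4 + \sqrt{5}$.

If $\lambda_{n+2} > \sqrt{5}$, the proof is exactly the same. Thus we need to consider only the case in which $\lambda_{n+1} > \sqrt{5}$. By (5), at least one of the numbers $\phi_{n+2}$ and $\phi_{n+3}$ is less than $\tfrac{1}{2}(\sqrt{5} - 1)$. Let us first assume it is $\phi_{n+2}$. It follows from Theorem 6 that

(36) $\lambda_n > (2 + \sqrt{5})/\lambda_{n+1}$

and from Theorem 5 that

(37) $\lambda_{n+2} > \lambda_{n+1}/(\lambda_{n+1} - 1)$.

Hence, by (36) and (37),

(38) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} > (2 + \sqrt{5})/\lambda_{n+1} + \lambda_{n+1} + \lambda_{n+1}/(\lambda_{n+1} - 1)$.

Let us write $\lambda$ for $\lambda_{n+1}$ and denote the right member of (38) by $f(\lambda)$. The maxima and minima of $f(\lambda)$ are taken at the real roots of

$g(\lambda) = \lambda^4 - 2\lambda^3 - (2 + \sqrt{5})\lambda^2 + 2(2 + \sqrt{5})\lambda - 2 - \sqrt{5} = 0$.

We have

$g'(\lambda) = 4\lambda^3 - 6\lambda^2 - 2(2 + \sqrt{5})\lambda + 2(2 + \sqrt{5})$,

$g''(\lambda) = 12\lambda^2 - 12\lambda - 2(2 + \sqrt{5})$.

It is sufficient to consider $\lambda > \sqrt{5}$, for $\lambda_{n+1} > \sqrt{5}$. Now $g''(\lambda) > 0$ for $\lambda > \sqrt{5}$, and so $g'(\lambda)$ is monotonic increasing for $\lambda > \sqrt{5}$. Since $g'(\sqrt{5}) > 0$, it follows that $g'(\lambda) > 0$ for $\lambda > \sqrt{5}$. Hence $g(\lambda)$ can have at most one root greater than $\sqrt{5}$. On the other hand, $\lambda = \tfrac{1}{2}(3 + \sqrt{5})$ is a root of $g(\lambda) = 0$. Hence if $\lambda > \sqrt{5}$, $f(\lambda)$ has its only minimum at $\lambda = \tfrac{1}{2}(3 + \sqrt{5})$. Thus it follows from (38) that

(39) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} > f\{\tfrac{1}{2}(3 + \sqrt{5})\} = \tfrac{1}{2}(5 + 3\sqrt{5})$.

Secondly, if $\phi_{n+3} < \tfrac{1}{2}(\sqrt{5} - 1)$, then the left-hand sides of (36) and (37) must be interchanged. But (38) and the proof of (39) remain the same. Theorem 12 now follows from (35) and (39).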
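
The arithmetic behind (39) is easy to confirm numerically. The following Python sketch (an illustration with our own function names, not part of the original argument) checks that $\tfrac{1}{2}(3 + \sqrt{5})$ is a root of $g$, that $f$ takes the value $\tfrac{1}{2}(5 + 3\sqrt{5})$ there, and that no smaller value of $f$ occurs on a grid of points $\lambda > \sqrt{5}$.

```python
from math import sqrt, isclose

S5 = sqrt(5.0)
C = 2.0 + S5

def f(x):
    """Right member of (38) with lambda_{n+1} = x."""
    return C / x + x + x / (x - 1.0)

def g(x):
    """Numerator of f'(x) after multiplying by x^2 (x - 1)^2."""
    return x**4 - 2.0 * x**3 - C * x**2 + 2.0 * C * x - C

root = 0.5 * (3.0 + S5)
grid = [S5 + 0.001 * k for k in range(1, 4000)]   # sample points with lambda > sqrt(5)
grid_min = min(f(x) for x in grid)

print(abs(g(root)) < 1e-9)
print(isclose(f(root), 0.5 * (5.0 + 3.0 * S5)))
print(grid_min >= f(root) - 1e-12)
```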

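If $\lambda_{n+1} > \tfrac{1}{2}(3 + \sqrt{5})$, we may use (37) instead of (36) for $\lambda_n$ and $\lambda_{n+1}$ and obtain

(40) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} > \lambda_{n+1}(\lambda_{n+1} + 1)/(\lambda_{n+1} - 1)$,

which is sharper than (38). For $\lambda_{n+1} = \tfrac{1}{2}(3 + \sqrt{5})$, (40) gives us the same estimate as does (39).

Now let us consider four consecutive $\lambda_n$.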


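THEOREM 13. For every $n$, we have

$\lambda_n + \lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3} > (21 + 5\sqrt{5})/4$.

Proof. We must distinguish between two cases.

Case I. Only one of the four $\lambda_i$ is greater than $\sqrt{5}$.

Clearly either $\lambda_{n+1} > \sqrt{5}$ or $\lambda_{n+2} > \sqrt{5}$.

(a) If $\lambda_{n+1} > \sqrt{5}$ then, by Theorem 4, we have

(41) $\lambda_{n+2} + \lambda_{n+3} > 4$,

(42) $\lambda_n + \lambda_{n+1} > \lambda_{n+1}^2/(\lambda_{n+1} - 1) > 5/(\sqrt{5} - 1) = 5(\sqrt{5} + 1)/4$.

Hence,

$\lambda_n + \lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3} > (21 + 5\sqrt{5})/4$.

(b) If $\lambda_{n+2} > \sqrt{5}$, the proof can be obtained by interchanging the left-hand sides of (41) and (42).

Case II. At least two of the $\lambda_i$ are greater than $\sqrt{5}$.

(a) $\max(\lambda_n, \lambda_{n+1}) > \sqrt{5}$, and $\max(\lambda_{n+2}, \lambda_{n+3}) > \sqrt{5}$.

From Theorem 4, we obtain

$\lambda_n + \lambda_{n+1} > 5(\sqrt{5} + 1)/4$,

$\lambda_{n+2} + \lambda_{n+3} > 5(\sqrt{5} + 1)/4$.

Hence

(43) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3} > 5(\sqrt{5} + 1)/2 > (21 + 5\sqrt{5})/4$.

(b) $\min(\lambda_n, \lambda_{n+1}) > \sqrt{5}$.

Here we have

$\lambda_n + \lambda_{n+1} > 2\sqrt{5}$,

$\lambda_{n+2} + \lambda_{n+3} > 4$.

Hence

(44) $\lambda_n + \lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3} > 4 + 2\sqrt{5} > 5(\sqrt{5} + 1)/2 > (21 + 5\sqrt{5})/4$.

(c) $\min(\lambda_{n+2}, \lambda_{n+3}) > \sqrt{5}$.

The proof here is exactly as in case (b), and so the proof is complete.

From Theorem 13 we obtain easily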


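THEOREM 14. For an arbitrary irrational number $\xi$ we have

$\liminf_{m\to\infty} \bigl(\sum_{\nu=0}^{m-1} \lambda_\nu\bigr)/m \ge (21 + 5\sqrt{5})/16 > 2$.

Proof. Let $k$ be an integer such that $4k \le m \le 4k + 3$. Then

$\bigl(\sum_{\nu=0}^{m-1} \lambda_\nu\bigr)/m \ge \bigl(\sum_{\nu=0}^{4k-1} \lambda_\nu\bigr)/m > k(21 + 5\sqrt{5})/4m \ge k(21 + 5\sqrt{5})/(16k + 12)$.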


Theorem 13 can be improved by using larger sequences of consecutive $\lambda_n$. We shall consider only sequences of eight.

THEOREM 15. The sum of any eight consecutive $\lambda_n$ is greater than

$(31 + 15\sqrt{5})/4$.

Proof. Let $\lambda_n, \lambda_{n+1}, \cdots, \lambda_{n+7}$ be the sequence in question. Since

$(31 + 15\sqrt{5})/4 = (21 + 5\sqrt{5})/4 + 5(\sqrt{5} + 1)/2$,

it follows from (43) and (44) that we need to consider only the case in which the first four and the last four of the $\lambda_i$ each contain only one term exceeding $\sqrt{5}$. In this case it follows from Borel's Theorem that $\lambda_{n+2} > \sqrt{5}$ and $\lambda_{n+5} > \sqrt{5}$, and all the others are less than $\sqrt{5}$. Thus, by Theorem 9, $\lambda_{n+2} > 3$ and $\lambda_{n+5} > 3$. Since $\lambda_n + \lambda_{n+1} > 4$, $\lambda_{n+3} + \lambda_{n+4} > 4$, and $\lambda_{n+6} + \lambda_{n+7} > 4$, we obtain

$\sum_{\nu=0}^{7} \lambda_{n+\nu} > 18 > (31 + 15\sqrt{5})/4$.


THEOREM 16. We have

$\liminf_{m\to\infty} \bigl(\sum_{\nu=0}^{m-1} \lambda_\nu\bigr)/m \ge (31 + 15\sqrt{5})/32 > 2.0169$.

It follows from Hurwitz' Theorem that this lower limit cannot exceed $\sqrt{5}$ for certain irrational numbers $\xi$.
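
To see how the bound of Theorem 16 compares with actual averages, the following Python sketch computes the mean of the first 200 values $\lambda_\nu$ for two quadratic irrationals. It is an illustration only: it assumes the reading $\lambda_k = \xi_{k+1} + \phi_{k+1}$ of (3), truncates the continued fractions to finite lists, and the function name is ours.

```python
from math import sqrt

def mean_lambda(q, m):
    """Arithmetic mean of lambda_0..lambda_{m-1}, with lambda_k = xi_{k+1} + phi_{k+1}."""
    xi = [0.0] * len(q)
    xi[-1] = float(q[-1])
    for k in range(len(q) - 2, -1, -1):   # xi_k = q_k + 1/xi_{k+1}
        xi[k] = q[k] + 1.0 / xi[k + 1]
    total, phi = 0.0, 0.0                 # phi starts as phi_1 = 0
    for k in range(m):
        total += xi[k + 1] + phi          # lambda_k
        phi = 1.0 / (q[k + 1] + phi)      # phi_{k+2} = 1/(q_{k+1} + phi_{k+1})
    return total / m

bound = (31.0 + 15.0 * sqrt(5.0)) / 32.0
golden_mean = mean_lambda([1] * 260, 200)          # [1, 1, 1, ...]
silver_mean = mean_lambda([1] + [2] * 260, 200)    # sqrt(2) = [1, 2, 2, ...]
print(bound, golden_mean, silver_mean)
```

For the golden ratio the mean approaches $\sqrt{5}$, in agreement with the closing remark on Hurwitz' Theorem; for $\sqrt{2}$ it approaches $2\sqrt{2}$.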


UNIVERSITY OF NORTH CAROLINA. 


BIBLIOGRAPHY. 


[1] E. Borel, “Contribution à l’analyse arithmétique du continu,” Journal de Mathématiques pures et appliquées, ser. 5, vol. 9 (1903), pp. 329-375.

[2] L. R. Ford, “A geometrical proof of a theorem of Hurwitz,” Proceedings of the Edinburgh Mathematical Society, vol. 35 (1916), pp. 59-65.

[3] M. Fujiwara, “Bemerkung zur Theorie der Approximation der irrationalen Zahlen durch rationale Zahlen,” Science Reports of the Tôhoku Imperial University, ser. 1, vol. 13 (1924), pp. 1-11.

[4] ———, “Remarks on the theory of approximation of irrational numbers by rational numbers,” Japanese Journal of Mathematics, vol. 1 (1924), pp. 15-16.

[5] S. Fukasawa (Morimoto), “Über die Kleinsche geometrische Darstellung des Kettenbruchs,” Japanese Journal of Mathematics, vol. 2 (1926), pp. 101-114.

[6] ———, “Beweise einiger Sätze in der Kettenbruchstheorie durch die Humbertsche geometrische Darstellung,” Japanese Journal of Mathematics, vol. 7 (1930), pp. 305-314.

[7] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Oxford, Oxford University Press, 2nd edition, 1945.

[8] P. J. Heawood, “The classification of rational approximations,” Proceedings of the London Mathematical Society, (2), vol. 20 (1922), pp. 233-250.

[9] G. Humbert, “Remarques sur certaines suites d’approximation,” Journal de Mathématiques pures et appliquées, ser. 7, vol. 2 (1916), pp. 155-167.

[10] A. Hurwitz, “Über die angenäherte Darstellung der Irrationalzahlen durch rationale Brüche,” Mathematische Annalen, vol. 39 (1891), pp. 279-284.

[11] J. F. Koksma, Diophantische Approximationen, Berlin, Julius Springer, 1936.

[12] K. Kurosu, “Note on the theory of approximation of irrational numbers by rational numbers,” Tôhoku Mathematical Journal, vol. 21 (1922), pp. 247-260.

[13] O. Perron, Die Lehre von den Kettenbrüchen, Leipzig, Teubner, 2nd edition, 1929.

[14] ———, “Über die Approximation irrationaler Zahlen durch rationale,” I, II, Sitzungsberichte der Heidelberger Akademie der Wissenschaften, Mathematisch-naturwissenschaftliche Klasse, Abteilung A, 1921, no. 4 and no. 8.

[15] K. Shibata, “On the approximation of irrational numbers by rational numbers,” Tôhoku Mathematical Journal, vol. 23 (1924), pp. 328-337.

[16] K. T. Vahlen, “Über Näherungswerte und Kettenbrüche,” Journal für die reine und angewandte Mathematik, vol. 115 (1895), pp. 221-233.




ON LINEAR REPULSIVE FORCES.* 


By AUREL WINTNER.


1. The following theorem, and a refinement of it, will be proved: 


In a linear, non-conservative system, of reversible type and of $n$ degrees of freedom, let the kinetic energy be determined by $n$ constant, positive mass factors, and let the potential energy be a non-positive definite quadratic form at every fixed $t$, with a coefficient matrix which varies with $t$ continuously; $0 \le t < \infty$. Then, no matter what the behavior, as $t \to \infty$, of this matrix may be, there exist solutions which are bounded as $t \to \infty$. More specifically, $n$ (out of a total of $2n$) linearly independent solutions will be of this type.

The situation is illustrated in the trivial conservative case, in which the general solution is of the form

$\sum_{k=1}^{n} (a_k e^{\lambda_k t} + b_k e^{-\lambda_k t})$,

where $a_k$, $b_k$ are constant vectors, and $\lambda_1, \cdots, \lambda_n$ denote non-negative exponents which are independent of $t$ and of the integration constants. Another illustration is the non-conservative case with a single degree of freedom. Then the theorem reduces to a well-known result of A. Kneser¹ concerning the scalar differential equation $x'' = p(t)x$, where $p(t) \ge 0$ is continuous for $0 \le t < \infty$.

2. Since the kinetic energy is of the form $\tfrac{1}{2}\sum m_i x_i'^2$, where every $m_i$ is positive and independent of $t$, there is no loss of generality in assuming that every $m_i$ is 1. In fact, the substitution $\sum m_i x_i'^2 \to \sum x_i'^2$ can be effected by a linear transformation of the coordinates $x_1, \cdots, x_n$ which is independent of $t$ and has a non-vanishing determinant; so that, by the inertia theorem, the signature of the quadratic form representing the potential energy remains unaltered at every $t$.

In the resulting normalization, the Lagrangian equations can be written in the form

(1) $x'' - F(t)x = 0$,

* Received July 19, 1948. 
¹ A. Kneser, “Untersuchung und asymptotische Darstellung der Integrale gewisser Differentialgleichungen bei grossen reellen Werthen des Arguments, I,” Journal für die reine und angewandte Mathematik, vol. 116 (1896), pp. 178-212.




where $F(t)$ is a real, symmetric, non-negative definite, $n$-rowed matrix which is a continuous function of $t$, and $x = x(t)$ is the coordinate vector $(x_1, \cdots, x_n)$.

If a dot denotes scalar multiplication of vectors, the quadratic form belonging to the matrix $F(t)$ is $x \cdot F(t)x$. Since this form is supposed to be incapable of negative values, it follows from (1) that the inequality $x \cdot x'' \ge 0$ holds for every $t$, if $x(t)$ is any solution vector (with real components). On the other hand, $x \cdot x'' = (\tfrac{1}{2}x^2)'' - x'^2$, where $y^2 = y \cdot y = |y|^2$ for $y = x, x'$. It follows therefore from $x \cdot x'' \ge 0$ that

(2) $r'' \ge 0$, where $r = |x|^2$.


3. The last inequality means that the squared length of any solution vector $x(t)$, when plotted against the $t$-axis, is everywhere convex toward the $t$-axis. Since the curve $r = r(t)$ must be above or on the $t$-axis, it follows that, if the trivial solution, $x(t) \equiv 0$, of (1) is excluded, the curve $r = r(t)$ cannot have more than one of its points on the $t$-axis. Accordingly, if $0 \le \alpha < \beta < \infty$, a solution vector $x(t)$ satisfies both $n$-ary conditions

(3) $x(\alpha) = 0$, $x(\beta) = 0$,

only if $x(t) \equiv 0$.

It follows that, if $a$, $b$ is any pair of points in the $x$-space, there exists a unique solution vector $x = x(t) = x_{ab}(t)$ satisfying the boundary conditions

(4) $x(\alpha) = a$, $x(\beta) = b$.

In fact, if $x^1(t), \cdots, x^{2n}(t)$ is any fixed system of linearly independent solutions of (1), every solution $x(t)$ determines a unique set of $2n$ scalar integration constants $c_i$ satisfying

(5) $x(t) = \sum_{i=1}^{2n} c_i x^i(t)$;

conversely, (5) is a solution for every choice of $c_1, \cdots, c_{2n}$. Hence, if $a$ and $b$ are given, (4) represents $2n$ linear equations for $2n$ unknowns $c_i$, and so the assertion is equivalent to the statement that the determinant of these equations is not 0. But if this determinant were 0, the homogeneous equations, those represented by the case (3) of (4), would have a solution $(c_1, \cdots, c_{2n}) \ne (0, \cdots, 0)$. Since such a set of integration constants defines a solution (5) distinct from $x(t) \equiv 0$, there results a contradiction to the fact that (3) holds only if $x(t) \equiv 0$.

Clearly, this argument is just the $n$-dimensional formulation of Jacobi's proof for the non-existence of conjugate points on the geodesics of surfaces of non-positive Gaussian curvature.


4. Choose in (4)

(6) $\alpha = 0$, $\beta = m$, $a = e$, $b = 0$,

where $m$ is a positive integer and $e$ a point of the unit sphere $|x| = 1$. For an arbitrarily fixed $e$, let ${}^m x(t)$, where $0 \le t < \infty$, denote the solution $x(t)$ determined by (6) and (4). Then, as shown above, none of the functions

(7) $r_m = r_m(t) = |{}^m x(t)|^2$

can have two zeros. On the other hand, each of the functions (7) has a zero; in fact, by (6) and (4),

(8) $r_m(0) = 1$, $r_m(m) = 0$.

But (8) and the convexity condition (2) imply that $r_m(t)$ is monotone for $0 \le t < m$. Accordingly,

(9) $r_m(t) > 0$, $r_m'(t) \le 0$, $r_m''(t) \ge 0$ if $0 \le t < m$.

The first of the conditions (8) and the second of the inequalities (9) imply that the functions $r_m(t)$ are uniformly bounded on every fixed, bounded $t$-interval. In view of (7), the same holds for the vectors ${}^m x(t)$ and so, by (1), for the vectors ${}^m x''(t)$ as well, the elements of the matrix $F(t)$ being continuous, hence locally bounded. On the other hand, the uniform boundedness of ${}^m x''(t)$ for $0 \le t \le 1$ implies, by Taylor's theorem, the boundedness of ${}^m x(1) - {}^m x(0) - {}^m x'(0)$ as $m \to \infty$. Since both ${}^m x(1)$ and ${}^m x(0)$ are bounded, this means that ${}^m x'(0)$ is bounded, as $m \to \infty$. Hence, a quadrature shows that the uniform boundedness of ${}^m x''(t)$ on every fixed, bounded $t$-interval implies the same for ${}^m x'(t)$.

Accordingly, both sequences ${}^1 x(t), {}^2 x(t), \cdots$ and ${}^1 x'(t), {}^2 x'(t), \cdots$ are uniformly bounded, and have uniformly bounded derivatives, and are therefore equicontinuous, on every fixed, bounded $t$-interval. Consequently, the sequence $1, 2, \cdots$ contains a subsequence having the property that, if the $m$-th element of the subsequence is denoted simply by $m$, then ${}^m x(t)$ and ${}^m x'(t)$ tend, as $m \to \infty$, to certain functions, $x(t)$ and $x'(t)$, uniformly on every fixed, bounded interval contained in the half-line $0 \le t < \infty$, and the limit $x'(t)$ is the derivative of the limit $x(t)$.
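
The construction can be illustrated in the scalar case $n = 1$, where (1) reads $x'' = p(t)x$. The Python sketch below is an illustration under stated assumptions: the coefficient $p(t) = 1 + t$, the value $m = 3$, the Runge-Kutta integrator and all function names are our own choices. It builds the solution with $x(0) = 1$, $x(m) = 0$ from two fundamental solutions and checks that $r = x^2$ is non-increasing and convex on $[0, m]$, as asserted in (9).

```python
def rk4_fundamental(p, m, steps):
    """Integrate x'' = p(t) x on [0, m]; return samples of u (u(0)=1, u'(0)=0)
    and v (v(0)=0, v'(0)=1) by the classical Runge-Kutta method."""
    h = m / steps

    def deriv(t, y):
        u, du, v, dv = y
        return (du, p(t) * u, dv, p(t) * v)

    y, t = (1.0, 0.0, 0.0, 1.0), 0.0
    us, vs = [y[0]], [y[2]]
    for _ in range(steps):
        k1 = deriv(t, y)
        k2 = deriv(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)))
        k3 = deriv(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)))
        k4 = deriv(t + h, tuple(a + h * b for a, b in zip(y, k3)))
        y = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
        t += h
        us.append(y[0])
        vs.append(y[2])
    return us, vs

def boundary_solution(p, m, steps=2000):
    """Samples of the solution with x(0) = 1 and x(m) = 0 (case (6) for n = 1)."""
    us, vs = rk4_fundamental(p, m, steps)
    c = us[-1] / vs[-1]
    return [u - c * v for u, v in zip(us, vs)]

xs = boundary_solution(lambda t: 1.0 + t, m=3.0)
r = [x * x for x in xs]
monotone = all(r[i + 1] <= r[i] + 1e-12 for i in range(len(r) - 1))
convex = all(r[i + 1] - 2 * r[i] + r[i - 1] >= -1e-12 for i in range(1, len(r) - 1))
print(monotone, convex)
```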




5. Since every ${}^m x(t)$ is a solution of (1), it is clear that the limit function $x(t)$, just obtained, is a solution of (1), and that, corresponding to (9) and (7),

(10) $r(t) > 0$, $r'(t) \le 0$, $r''(t) \ge 0$ for $0 \le t < \infty$,

where $r(t) = |x(t)|^2$. Hence, the theorem italicized above can be refined as follows:

If $F(t)$, where $0 \le t < \infty$, is an $n$-rowed, real, symmetric matrix of continuous functions which is non-negative definite at every fixed $t$, then the system (1) has $n$ linearly independent solution vectors $x(t)$ which are “asymptotic solutions,” in the sense that $r(t) = |x(t)|^2$ satisfies (10); so that, in particular,

$\lim_{t\to\infty} r(t)$ exists $(< \infty)$ and $\lim_{t\to\infty} r'(t) = 0$.

In fact, since every ${}^m x(0)$ was chosen to be $e$, an arbitrary vector of length 1, and since ${}^m x(t) \to x(t)$ as $m \to \infty$ holds for every $t$ and so for $t = 0$, the initial vector $x(0)$ of the solution occurring in (10) can be chosen to be any vector $e$ of length 1. Hence, there are admitted $n$ linearly independent initial vectors $x(0)$. But then, no matter what the initial vectors $x'(0)$ may be, there must result $n$ linearly independent solution vectors $x(t)$.


6. If $A$ is any matrix, let $A^*$ denote the transposed matrix.

The preceding theorem remains true if $F(t)$ in (1) is replaced by $F(t) + S(t)$, where $S(t)$ is any matrix which is skew-symmetric and continuous at every $t$.

In particular, instead of the assumption that $F(t)$ in (1) is symmetric and non-negative definite, it is sufficient to assume that $F(t)$ in (1) is such as to make the matrix $F(t) + F^*(t)$ non-negative definite.

The second of these assertions follows from the first, since $F + S$ becomes $\tfrac{1}{2}(F + F^*)$ if $S = \tfrac{1}{2}(F^* - F)$.

In order to verify the first assertion, let (1) be replaced by

$x'' - (F + S)x = 0$.

Then, since $x \cdot Sx = 0$ is an identity for every skew-symmetric $S$,

$x \cdot x'' = x \cdot Fx$.

This leads to (2), as above, and the balance of the proof remains unchanged.

Interesting is the particular case $x'' = S(t)x$, where $S(t)$ is any real, skew-symmetric matrix of continuous functions. This case is covered by the above results, since the zero matrix, being non-negative definite, can be chosen to be $F(t)$ in $F(t) + S(t)$.
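
The identity $x \cdot Sx = 0$, on which this extension rests, can be checked on a small numerical example. In the following Python sketch the matrix $G$ and the vector $x$ are hypothetical sample data, and the helper names are ours; the quadratic form of $G$ is seen to depend only on its symmetric part $\tfrac{1}{2}(G + G^*)$.

```python
def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(m, x):
    """Product of a square matrix (list of rows) with a vector."""
    return [dot(row, x) for row in m]

def split(m):
    """Return (symmetric part, skew-symmetric part) of a square matrix m."""
    n = len(m)
    sym = [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]
    skew = [[0.5 * (m[i][j] - m[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew

G = [[2.0, 3.0, -1.0], [-1.0, 1.0, 4.0], [5.0, 0.0, 3.0]]   # hypothetical non-symmetric matrix
x = [0.3, -1.2, 2.0]                                        # hypothetical vector
sym, skew = split(G)

skew_form = dot(x, matvec(skew, x))      # x . Sx, vanishes identically
full_form = dot(x, matvec(G, x))         # x . Gx
sym_form = dot(x, matvec(sym, x))        # x . (1/2)(G + G*) x
print(skew_form, full_form, sym_form)
```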


7. The result concerning the number of “asymptotic” (or, for that matter, just bounded) linearly independent solutions cannot be improved. In fact, under any of the above assumptions, there exist $n$ (out of $2n$) linearly independent solutions $x(t)$ satisfying $|x(t)| \to \infty$ as $t \to \infty$.

First, it is clear that there exist $n$ linearly independent $2n$-ary vectors $v_1, \cdots, v_n$ having the following property: If $v = (c_1, \cdots, c_{2n})$ is any of the $2n$-ary vectors $v_1, \cdots, v_n$, then $a \cdot b > 0$ holds for the pair of $n$-ary vectors defined by $a = (c_1, \cdots, c_n)$, $b = (c_{n+1}, \cdots, c_{2n})$.

Corresponding to any such $a$, $b$, let $x(t)$ denote the solution defined by the initial conditions $x(0) = a$, $x'(0) = b$. Then $r'(0) = 2a \cdot b$, since $r(t) = x(t) \cdot x(t)$. Consequently, $r'(t) > 0$ holds at $t = 0$ and therefore at every $t$ close enough to $t = 0$. It follows therefore from (2) that, for reasons of convexity, $r(t) \to \infty$ as $t \to \infty$.


THE JOHNS HOPKINS UNIVERSITY. 




ON THE LAPLACE-FOURIER TRANSCENDENTS.* 


By PHILIP HARTMAN and AUREL WINTNER.


The present paper deals with extensions of the results proved on pp. 87-96 of vol. 69 (1947) of this Journal. The nature of the extensions in question can be illustrated by the following theorem, proved loc. cit. only for the particular case $a_0 = a_1 = \cdots = 0$.

If $a_n$, $b_n$ are real, non-negative numbers corresponding to which the coefficient functions of the differential equation

(1) $x'' + \bigl(\sum_{n=0}^{\infty} (-1)^n a_n/t^{2n+1}\bigr)x' + \bigl(\sum_{n=0}^{\infty} (-1)^n b_n/t^{2n}\bigr)x = 0$

are entire functions in $1/t$, then (1) has on the half-line

(2) $0 < t < \infty$

a solution of the form

(3) $x(t) = \bigl(\int_0^\infty\bigr) e^{-itu}\,d\alpha(u) \not\equiv 0$,

where $\alpha(u)$ is monotone (but not necessarily bounded) for $0 \le u < \infty$, and the convergence of (3) on (2) is part of the statement.

In (3), the parenthesis of the integral sign refers to the Abelian evaluation,

$\lim_{\epsilon\to+0} \int_0^\infty e^{-(\epsilon + it)u}\,d\alpha(u)$,

of $\int_0^\infty e^{-itu}\,d\alpha(u)$.

If $k$ is a real index, not necessarily an integer, then Bessel's equation,

$t^2x'' + tx' + (t^2 - k^2)x = 0$,

supplies an illustration of the theorem, since the coefficients $a_n$, $b_n$ in (1) become

$a_0 = 1$, $a_1 = a_2 = \cdots = 0$; $b_0 = 1$, $b_1 = k^2$, $b_2 = b_3 = \cdots = 0$;

cf. loc. cit., p. 89 (the footnote on p. 89 is relevant in the present case also).

If $t$ is replaced by $it$, then (1) goes over into

(4) $x'' + r(t)x' - q(t)x = 0$,


* Received September 1, 1948. 


where

(5) $r(t) = \sum_{n=0}^{\infty} a_n/t^{2n+1}$, $q(t) = \sum_{n=0}^{\infty} b_n/t^{2n}$.

If $m = 0, 1, 2, \cdots$, then $(-1)^m$ times the $m$-th derivative of either of the functions (5) is non-negative on the half-line (2), since $a_n \ge 0$, $b_n \ge 0$. In other words, both functions (5) are completely monotone on (2). It will be shown that this alone implies for (4) the existence of a solution of the form

(6) $x(t) = \int_0^\infty e^{-tu}\,d\alpha(u) \not\equiv 0$

on (2), where $\alpha(u)$ is monotone for $0 \le u < \infty$. For reasons of analyticity, (6) must be a solution of (4) when $t$, instead of being positive, is complex with a positive real part. In particular, (6) will represent a solution of (4) on the line $\epsilon + it$, where $\epsilon > 0$ is fixed and $t$ is a real variable. Hence, the existence of a solution (3) of (1) follows by letting $\epsilon \to 0$.
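
A concrete instance may be helpful. For Bessel's equation with $k = 0$, (4) becomes $x'' + x'/t - x = 0$, and the integral $\int_0^\infty e^{-t\cosh s}\,ds$ (a standard representation of the modified Bessel function $K_0$, which is not named in the text and is brought in here only as an illustration) is of the form (6) after the substitution $u = \cosh s$. The Python sketch below, with our own function names and a truncated trapezoid rule, checks numerically that this integral satisfies the equation and has the first sign pattern of a completely monotone function.

```python
from math import cosh, exp

def laplace_moments(t, upper=12.0, steps=24000):
    """Trapezoid values of I_j(t) = integral_0^upper cosh(s)^j exp(-t cosh s) ds, j = 0, 1, 2."""
    h = upper / steps
    sums = [0.0, 0.0, 0.0]
    for i in range(steps + 1):
        c = cosh(i * h)
        w = (0.5 if i in (0, steps) else 1.0) * exp(-t * c)
        sums[0] += w
        sums[1] += w * c
        sums[2] += w * c * c
    return [h * s for s in sums]

def check(t):
    """Return x, x', x'' and the residual of x'' + x'/t - x = 0 for x = I_0."""
    i0, i1, i2 = laplace_moments(t)
    x, dx, ddx = i0, -i1, i2          # differentiation under the integral sign
    return x, dx, ddx, ddx + dx / t - x

results = {t: check(t) for t in (0.5, 1.0, 3.0)}
for t, (x, dx, ddx, residual) in results.items():
    print(t, x > 0, dx < 0, ddx > 0, abs(residual) < 1e-8)
```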

The proof for the existence of a solution (6) of (4) will be based on 
the following 


LEMMA. On an interval

(7) $0 \le t < T$,

where the case of a half-line $(T = \infty)$ is allowed, let $p(t)$, $r(t)$, $q(t)$ be real-valued, continuous functions, the first of which is positive, the second arbitrary, the third non-negative. Then the differential equation

(8) $p(t)x'' + r(t)x' - q(t)x = 0$

has a solution which is positive and non-increasing on (7).

The same is true if (8) is replaced by

(9) $(P(t)x')' - Q(t)x = 0$,

where $P(t) > 0$ and $Q(t) \ge 0$ are continuous on (7).

Remark. The assertions of the Lemma remain true if (7) is replaced by the open interval $0 < t < T$ (where, as before, $T = \infty$ is allowed).

It will be noted that (9) becomes a particular case,

$p = P$, $r = P'$, $q = Q$,

of (8) only if $P$, instead of being just continuous, is required to have a continuous derivative.




If (8) reduces to

(10) $x'' - q(t)x = 0$, where $q(t) \ge 0$,

the assertion of the Lemma becomes identical with the result of A. Kneser (1896), used loc. cit.

In the case of (9), replace $t$ by $s = s(t)$, where

$s = \int_0^t dv/P(v)$.

Then $s$ is an increasing function of $t$, and (9) appears in the form (10), if the independent variable of (10) is $s$, and $q$ denotes the product $PQ$ (as a function of $s$). Since $PQ \ge 0$, the assumption, $q \ge 0$, of (10) is satisfied.

In the case of (8), put

$P(t) = \exp \int_0^t r(v)\,dv/p(v)$

and multiply (8) by $P(t)/p(t)$ (which is positive throughout). Then (8) appears in the form (9), where $Q(t) = q(t)P(t)/p(t)$ (hence $Q(t) \ge 0$).

This proves the Lemma. The Remark following the Lemma results if (7) is replaced by the interval $\epsilon \le t < T$, where $\epsilon > 0$, and, after the Lemma has been applied on the latter interval, $\epsilon$ tends to 0.

By the procedure applied loc. cit., there will now be proved the main
By the procedure applied loc. cit., there will now be proved the main 


THEOREM. Let two functions, $q(t)$ and $r(t)$, and the first derivative, $p'(t)$, of a third positive function, $p(t)$, be completely monotone on the half-line (2). Then the differential equation (8) has a solution $x(t) \not\equiv 0$ which is completely monotone on (2), i.e., which is representable on (2) in the form (6) (Hausdorff-Bernstein), where $\alpha(u)$ is non-decreasing, but not necessarily bounded, for $0 \le u < \infty$.

COROLLARY. If $p(t)$ is positive on (2) and if $q(t)$ and the first derivative of $p(t)$ are completely monotone on (2), then the self-adjoint differential equation

$(p(t)x')' - q(t)x = 0$

has a solution $x(t) \not\equiv 0$ which is completely monotone on (2).

In fact, the assumptions of the Corollary imply those of the Theorem, $r(t)$ being $p'(t)$.
In the particular case (10) of (8), the assertion of the Theorem was 


369 

0 
t 
0 
d | 


370 PHILIP HARTMAN AND AUREL WINTNER. 


proved loc. cit. Although (8) is a normal form of (10), the reduction of 
the assertion of the Theorem to this normal form leads to difficulties. For 
this reason, the following proof for the general case will proceed directly. 
First, since p > 0 and q= 0 are continuous on (2), the Remark following 
the Lemma assures for (8) the existence of a solution z(t) 0 satisfying 


(11) (— 1)"d"r/dit"= 0 for n=0 and n =1, where 0 <t< 
Next, if a = —t, then (2) goes over into — 0 <s <0, and (8) into 
(12) P(s)d*x/ds? = R(s)dx/ds + Q(s)a, 
where 
(13) P(s)=p(—s), R(s)=r(—s), Q(s) = G(s). 


Since $p(t)$ is positive, and $dp(t)/dt$, $r(t)$, $q(t)$ are completely monotone,
on (2), it is clear from (13) that $P(s)$ is positive, that each but the
0-th derivative of $P(s)$ is non-positive, and that each (including the 0-th)
derivative of $R(s)$ and $Q(s)$ is non-negative (if the derivatives are taken
with respect to s, where $-\infty < s < 0$). Hence, if (12) is differentiated n
times with respect to s, it is seen that the product $P\,d^{n+2}x/ds^{n+2}$ is a linear
form, with non-negative coefficients, in

(14) $x,\ dx/ds,\ d^2x/ds^2,\ \cdots,\ d^{n+1}x/ds^{n+1}$.

Since $P > 0$, it follows that $d^{n+2}x/ds^{n+2}$ is non-negative if each of the $n + 2$
functions (14) is non-negative. Consequently,

(15) $d^n x/ds^n \ge 0$, where $-\infty < s < 0$,

holds for every n, if (15) holds for $n = 0$ and for $n = 1$.

Since $t = -s$, the assertion of (11) is that (15) holds for $n = 0$ and
for $n = 1$. Hence, (15) holds for every $n \ge 0$. In view of $t = -s$, this
means that

(16) $(-1)^n d^n x/dt^n \ge 0$, where $0 < t < \infty$,

holds for every $n \ge 0$.

According to (16), the solution $x(t)$ of (8) is completely monotone on
(2). Hence, in order to complete the proof of the Theorem, it is sufficient
to ascertain that $x(t) \not\equiv 0$. But this follows from the choice of $x(t)$,
according to which $x(t)$ is positive (at every $t > 0$).




APPENDIX. 


As emphasized above, the non-decreasing function $\alpha(u)$ + const. occurring
in (6) is not in general bounded as $u \to \infty$. A by-product of the following
considerations will be a characterization of the coefficient functions $p(t)$,
$r(t)$, $q(t)$ of those differential equations (8) corresponding to which the
function $\alpha(u)$ occurring in the solution (6) is bounded as $u \to \infty$.

First, since $\alpha(u)$ is non-decreasing, it is clear from (6) that $\alpha(u)$ is
bounded as $u \to \infty$ if and only if $x(t)$ is bounded as $t \to +0$. But the
boundedness of $x(t)$ near $t = 0$, being an issue of purely local nature, has
nothing to do with the question of complete monotony. Hence, it is sufficient
to answer the question for the case in which (8) is given in its normal form
(10). The answer then runs as follows:


Let $q(t)$ be real-valued, continuous and non-negative for $0 < t \le 1$.
Then, according to the Lemma,

(17) $x'' - q(t)x = 0$

has a solution satisfying

(18) $x(t) > 0$ and $x'(t) \le 0$ for $0 < t \le 1$.

In order that this solution be bounded (as $t \to +0$), the condition

(19) $\int_{+0}^{1} t\,q(t)\,dt < \infty$

is necessary and sufficient.
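
The criterion (19) can be illustrated numerically. The following sketch is not part of the original argument. It assumes two sample coefficient functions chosen only for illustration: $q(t) = 1/t$, for which the integral in (19) converges, and $q(t) = 2/t^2$, for which it diverges (in agreement with the explicit solution $x(t) = 1/t$ of (17), which satisfies (18) and is unbounded). The helper name `weighted_integral` is hypothetical.

```python
import math

def weighted_integral(q, eps, steps=20000):
    """Approximate the integral of t*q(t) over [eps, 1].

    Midpoint rule in the variable u = log t, so that the grid is fine
    near the singular endpoint t = 0 (assumption: q is continuous on (0, 1]).
    """
    a, b = math.log(eps), 0.0
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = math.exp(a + (i + 0.5) * h)
        total += t * q(t) * t * h      # dt = t du
    return total

q_conv = lambda t: 1.0 / t        # t*q(t) = 1: integral stays below 1
q_div = lambda t: 2.0 / t**2      # t*q(t) = 2/t: integral grows like 2 log(1/eps)

for eps in (1e-2, 1e-4, 1e-8):
    print(eps, weighted_integral(q_conv, eps), weighted_integral(q_div, eps))
```

As the lower limit shrinks, the first column of integrals settles near 1 while the second keeps growing, which is the dichotomy expressed by (19).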


The sufficiency of (19) is quite on the surface, in the sense that it has
nothing to do with the assumption $q(t) \ge 0$. In fact, if $q(t)$, where
$0 < t \le 1$, is any continuous function, then

(19 bis) $\int_{+0}^{1} t\,|q(t)|\,dt < \infty$

is sufficient in order that every solution $x(t)$ of (17) be such as to have a
finite limit $x(+0)$. This is contained in Bôcher's extension of the theory
of Fuchsian points to the real field (Trans. Amer. Math. Soc., vol. 1 (1900),
pp. 40-52, § 4).

The proof of the converse depends on the consideration of the Lagrangian
function, say $\tfrac{1}{2}L$, of (17) along an arbitrary solution $x(t)$ of (17); so that
$L = L(t)$ and

(20) $L(t) = x'^2(t) + q(t)x^2(t)$.




Even if $q \ge 0$ is not assumed,

(21) $t\,x(t)x'(t) = \text{const.} + \tfrac{1}{2}x^2(t) - \int_t^1 sL(s)\,ds$

holds for $0 < t \le 1$. In fact, it is easily verified by differentiation that (21)
is an identity in t by virtue of (17) and (20).
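
The identity (21) can be checked numerically on an explicit example. The sketch below is an illustration only and assumes the sample case $q(t) \equiv 1$, $x(t) = e^{-t}$, which solves (17); the function names are hypothetical. It evaluates $t\,x\,x' - \tfrac{1}{2}x^2 + \int_t^1 sL(s)\,ds$ at several points and confirms that the value does not depend on t.

```python
import math

def x(t):
    return math.exp(-t)        # sample solution of x'' - q x = 0 with q = 1

def dx(t):
    return -math.exp(-t)

def q(t):
    return 1.0

def L(t):
    return dx(t) ** 2 + q(t) * x(t) ** 2      # the function (20)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def constant_in_21(t):
    """Left member of (21) minus the t-dependent part of the right member."""
    return t * x(t) * dx(t) - 0.5 * x(t) ** 2 + simpson(lambda s: s * L(s), t, 1.0)

values = [constant_in_21(t) for t in (0.05, 0.2, 0.5, 0.9, 1.0)]
print(values)
```

For this sample case the constant can also be found in closed form, namely $-\tfrac{3}{2}e^{-2}$.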
It will be concluded from (21) that, if $x(t)$ is a bounded solution
satisfying (18) and if $q \ge 0$, then the improper integral

(22) $\int_{+0}^{1} tL(t)\,dt$ is convergent.

In view of (20), where $q \ge 0$, this will imply that

$\int_{+0}^{1} t\,q(t)\,x^2(t)\,dt < \infty$.

Hence, (19) will follow from the fact that, according to (18), the function
$x(t)$ must keep away from 0 as $t \to +0$.

Actually, (18) and the boundedness of $x(t)$ assure the existence of a
positive limit $x(+0) < \infty$. Hence,

$x^2(1) - x^2(+0) = \int_{+0}^{1} d\,x^2(t) = 2\int_{+0}^{1} x(t)x'(t)\,dt$.

Suppose that $x(t)x'(t)$ is monotone. Then the convergence of the last
integral implies that

$t\,x(t)x'(t) \to 0$ as $t \to +0$.

This limit relation, when combined with (21), where $x^2(+0)$ is supposed to
exist ($\ne \infty$), contains the truth of (22). Hence, the proof will be complete
if it is ascertained that $x(t)x'(t)$ is monotone.

To this end, the assumption $q \ge 0$ will be used again. It implies, by
(17), that $xx'' \ge 0$. Since $(xx')' = x'^2 + xx''$, it follows that $(xx')' \ge 0$.
Hence, $x(t)x'(t)$ is non-decreasing.


JOHNS HOPKINS UNIVERSITY. 




FURTHER CONGRUENCE PROPERTIES OF THE FOURIER
COEFFICIENTS OF THE MODULAR INVARIANT $j(\tau)$.*


By JOSEPH LEHNER.


This article is a continuation of a previous one [2], in which it was
shown that the Fourier coefficients, $c_n$, of the modular invariant $j(\tau)$,

$j(\tau) = x^{-1} + 744 + 196884\,x + \cdots$, $x = \exp 2\pi i\tau$, $\Im(\tau) > 0$,

enjoy certain congruence properties with respect to the moduli 5, 7 and their
powers, and 11, $11^2$. In this paper we shall consider the moduli $2^\alpha$, $3^\alpha$, $\alpha \ge 1$,
and prove


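THEOREM 1. If

(1) $j(\tau) = x^{-1} + c_0 + \sum_{n=1}^{\infty} c_nx^n$, $x = \exp 2\pi i\tau$,

then

(2) $c_n \equiv 0 \pmod{2^{3\alpha+8}}$ if $n \equiv 0 \pmod{2^\alpha}$,

(3) $c_n \equiv 0 \pmod{3^{2\alpha+3}}$ if $n \equiv 0 \pmod{3^\alpha}$,

for $\alpha \ge 1$.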
The method of proof is that used in [2]. The congruences follow from
identities, in which the left members are series
$\sum c_nx^{n/p^\alpha}$, $n \equiv 0 \pmod{p^\alpha}$, $p = 2$ or 3,
while the right members are polynomials, with rational integral coefficients,
in certain functions $\Phi$, which themselves have series expansions with integral
coefficients.
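
The congruences (2) and (3) can be checked on the first few coefficients. The sketch below is an illustration, not the method of this paper. It assumes the classical representation $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum\sigma_3(n)x^n$ and $\Delta = x\prod(1-x^n)^{24}$, and all function names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def series_mul(a, b, N):
    """Product of two power series, truncated to N terms."""
    c = [0] * N
    for i, ai in enumerate(a[:N]):
        if ai:
            for k, bk in enumerate(b[:N - i]):
                c[i + k] += ai * bk
    return c

def j_coefficients(N):
    """Return [c_{-1}, c_0, c_1, ...] (N terms) of j = E4^3 / Delta."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, N)]
    numerator = series_mul(series_mul(e4, e4, N), e4, N)
    # Delta / x = prod (1 - x^n)^24, built one factor at a time.
    denom = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(24):
            for k in range(N - 1, n - 1, -1):
                denom[k] -= denom[k - n]
    # Long division by a series whose constant term is 1.
    quot = []
    for k in range(N):
        quot.append(numerator[k] - sum(quot[i] * denom[k - i] for i in range(k)))
    return quot

def valuation(n, p):
    """Exponent of the highest power of p dividing the non-zero integer n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

coeffs = j_coefficients(12)              # coeffs[n + 1] is c_n
for n in range(1, 11):
    a, b = valuation(n, 2), valuation(n, 3)
    if a:
        assert coeffs[n + 1] % 2 ** (3 * a + 8) == 0
    if b:
        assert coeffs[n + 1] % 3 ** (2 * b + 3) == 0
```

Exact integer arithmetic is used throughout, so the divisibility tests are not affected by rounding.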

The proof is in two parts. The first part consists in showing that for
$\alpha = 1$ (in Theorem 1), both members of the identity described above are
modular functions belonging to a certain subgroup of the modular group.
The identity is then established by matching the principal parts of the two
members at the parabolic vertices of the subgroup. § 1 is concerned with this.

The second part of the proof proceeds by induction on $\alpha$. Here the
modular equation for the function $\Phi$ plays an essential role. The modular
equation is an algebraic equation connecting $\Phi(\tau)$ and $\Phi(p\tau)$. For $p = 5, 7$,

* Received September 3, 1948. 




these equations were worked out by Rademacher ([3], pp. 626, 628). It
seems worth while to give here a systematic method which is applicable to
all primes of interest, viz., $p = 2, 3, 5, 7, 13$. This is done in § 2 and the
modular equations given. As will appear in the course of the argument, the
above primes are those for which the subgroup $\Gamma_0(p)$ is of genus zero.
($\Gamma_0(p)$ is defined by the condition $p \mid c$ in the modular substitution
$\tau' = (a\tau + b)/(c\tau + d)$.) The induction on $\alpha$ is carried out in § 3 and
completes the proof of Theorem 1.

In § 5 we consider modular functions other than $j(\tau)$ which share with it
the congruence properties of Theorem 1 and of Theorems 1 and 2 of [2].


1. Congruences for the moduli 2 and 3. We are going to follow
closely the reasoning in [2]. We recall that $F(\tau)$ is a modular function on
a group $\Gamma$ if


1) it is a meromorphic function of $\tau$ in the half-plane $\Im(\tau) > 0$;


2) it possesses only poles in the local variable at the parabolic vertices
of $\Gamma$;


3) it is invariant: $F(V\tau) = F(\tau)$, for every modular substitution
$V\tau = (a\tau + b)/(c\tau + d)$ of $\Gamma$.


In what follows, $\Gamma$ may be either the modular group or one of its subgroups
of finite index. $\Gamma_0(p)$ is the subgroup given by $c \equiv 0 \pmod p$. p is one of
the primes 2, 3, 5, 7, 13.

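We introduce the linear operator

(1.1) $U_pf(\tau) = \frac{1}{p}\sum_{\lambda=0}^{p-1} f\!\left(\frac{\tau+\lambda}{p}\right)$,

where $f(\tau)$ is given by the expansion

$f(\tau) = \sum_{n=-s}^{\infty} a_nx^n$,

and throughout this paper

(1.2) $x = \exp 2\pi i\tau$, $\Im(\tau) > 0$.

In [2], § 2 it is shown that

(1.3) $U_pf(\tau) = \sum_{n=t}^{\infty} a_{pn}x^n$, $t = -[s/p]$.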
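
The passage from the averaging definition of $U_p$ to the coefficient formula (1.3) can be checked on a truncated expansion. The sketch below is an illustration only. It assumes that $U_pf(\tau)$ is the average of $f((\tau+\lambda)/p)$ over $\lambda = 0, \cdots, p-1$, and that f is replaced by a finite section of its expansion; the function names are hypothetical.

```python
import cmath

def f_value(coeffs, s, tau):
    """Evaluate sum_{n >= -s} a_n x^n, x = exp(2 pi i tau); coeffs[k] = a_{k-s}."""
    return sum(a * cmath.exp(2j * cmath.pi * (k - s) * tau)
               for k, a in enumerate(coeffs))

def U_by_averaging(coeffs, s, p, tau):
    """(1/p) * sum over lambda of f((tau + lambda)/p)."""
    return sum(f_value(coeffs, s, (tau + lam) / p) for lam in range(p)) / p

def U_coefficients(coeffs, s, p):
    """Return (t, [a_{pt}, a_{p(t+1)}, ...]) with t = -[s/p], as in (1.3)."""
    t = -(s // p)
    picked = []
    n = t
    while p * n + s < len(coeffs):
        picked.append(coeffs[p * n + s])
        n += 1
    return t, picked

# Finite section of j(tau): a_{-1}, a_0, a_1, a_2, a_3.
section = [1, 744, 196884, 21493760, 864299970]
tau = 0.3 + 1.0j
for p in (2, 3):
    t, picked = U_coefficients(section, 1, p)
    from_series = sum(a * cmath.exp(2j * cmath.pi * (k + t) * tau)
                      for k, a in enumerate(picked))
    print(p, abs(U_by_averaging(section, 1, p, tau) - from_series))
```

For a finite section the two evaluations agree up to rounding, because the average over $\lambda$ annihilates every power $x^{n/p}$ with n not divisible by p.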


Moreover, it is proved that if $f(\tau)$ is a modular function belonging to the
full modular group, then $U_pf$ belongs to the subgroup $\Gamma_0(p)$. (The proof
was carried out for the function $j(\tau)$ but is easily seen to be valid for any
$f(\tau)$ on the full group.)

A formula which will be useful later is (8.81) of [1]:

(1.4) $pU_pF(-1/p\tau) - pU_pF(p\tau) = F(-1/p^2\tau) - F(\tau)$,

where $F(\tau)$ is an entire modular function of $\Gamma_0(p)$. By $U_pF(p\tau)$ we mean
$U_pF(\tau)$ with $\tau$ replaced by $p\tau$; similarly for $U_pF(-1/p\tau)$.

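Since the subgroups $\Gamma_0(p)$ are of genus zero for the primes of interest,
they will possess univalent functions, which may be taken as

(1.5) $\Phi(\tau) = \Phi_p(\tau) = \{\eta(p\tau)/\eta(\tau)\}^r$,

with

(1.6) $\eta(\tau) = \exp(\pi i\tau/12)\prod_{m=1}^{\infty}(1 - x^m)$,

and

(1.7) $r(p-1) = 24$.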


That the functions $\Phi$ belong to $\Gamma_0(p)$ has been demonstrated by Rademacher
for $p > 3$ ([3], Theorem 1, p. 619) if we keep in mind that, for the
primes under consideration in this paper, r is always even. But his proofs,
with minor modifications, are readily seen to hold for * $p = 2, 3$.

The functions $\Phi$ are clearly univalent in the fundamental region of
$\Gamma_0(p)$, as is evident from the following facts: 1) the fundamental region
of $\Gamma_0(p)$ has two parabolic points, say $\tau = i\infty, 0$; 2) $\Phi$ has a simple zero at
$\tau = i\infty$ and a simple pole at $\tau = 0$ measured in $x = \exp 2\pi i\tau$; 3) $\Phi$ is regular
and zero free in the interior of the fundamental region. These considerations
are developed in detail in [2] for $p > 3$ (§ 2) but are equally valid
for $p \le 3$.

With these preliminaries out of the way, we are ready to express $U_pj(\tau)$
as a polynomial in $\Phi(\tau)$. For this purpose we need consider the vertex at
$\tau = 0$ only, since $U_pj$ and $\Phi$ are both regular at $\tau = i\infty$ ([2], (2.3)).
From [2], (2.4), (2.6) we take the expansions, valid for $p \le 3$,


(1. 8) = 
(1.9) $\tau' = -1/p\tau$,


* When $p = 2$, $r = 24$, and $\{\eta(2\tau)/\eta(\tau)\}^{24} = \Delta(2\tau)/\Delta(\tau)$, where $\Delta$ is the
"discriminant" of elliptic function theory. $\Delta$ is a modular form of dimension $-12$;
its transformation equation is $\Delta(V\tau) = (c\tau + d)^{12}\Delta(\tau)$. It follows easily that
$\Delta(2\tau)/\Delta(\tau)$ is invariant on $\Gamma_0(2)$.




the coefficients in the right members being integers. From a comparison
of the principal parts it follows that $U_pj$ is equal to a polynomial in $\Phi$ whose
coefficients, except possibly for the constant term, are integers divisible by
$p^{r/2-1}$:

(1.10) $U_pj(\tau) = C_0 + p^{r/2-1}\sum_{\lambda=1}^{p^2} p^{(\lambda-1)r/2}\,C_\lambda\Phi^\lambda(\tau)$

with integral $C_1, C_2, \cdots$.
Since $r = 24$ for $p = 2$ and $r = 12$ for $p = 3$, Theorem 1 for $\alpha = 1$
follows at once from (1.10) and (1.4).


2. Modular equations. In order to generalize the congruences just
proved to powers of 2 and 3 as moduli, it will be necessary to introduce the
modular equation of $\Phi$. The required results are contained in


THEOREM 2. Let the genus of $\Gamma_0(p)$ be zero. Put

$Z = \Phi_p(\tau)$, $W = p^{r/2}\Phi_p(\tau/p)$.

Then

(2.1) $U_pZ = p\sum_{j=1}^{p} b_jZ^j$.

Moreover, the modular equation connecting Z and W is

(2.2) $W^p + \sum_{j=1}^{p} (-1)^j p_jW^{p-j} = 0$,

where

(2.3) $(-1)^{j+1}p_j = p^{r/2+2}\sum_{k=j}^{p} b_kZ^{k-j+1}$.

Here, the $b_j$ are integers with one exception: for $p = 13$, $13b_1$ is an integer.


Proof. The proof follows a method sketched in a previous paper ([1],
p. 511). Applying (1.4) with $F = \Phi_p(\tau) = Z$, we have

(2.4) $pU_pZ(-1/p\tau) - pU_pZ(p\tau) = Z(-1/p^2\tau) - Z(\tau)$.

Now assuming the truth of (2.1) for the moment, we get

(2.5) $pU_pZ(-1/p\tau) = p^2\sum_{j=1}^{p} b_jZ^j(-1/p\tau) = p^2\sum_{j=1}^{p} b_j\,p^{-rj/2}Z^{-j}(\tau)$,




where we have used the property

(2.6) $Z(-1/p\tau) = p^{-r/2}Z^{-1}(\tau)$

as given in [1], (8.83). Also

(2.7) $pU_pZ(p\tau) = p^2\sum_{j=1}^{p} b_jZ^j(p\tau)$,

and

(2.8) $Z(-1/p^2\tau) = p^{-r/2}Z^{-1}(p\tau)$,

as follows from (2.1) and (2.6) on replacing $\tau$ by $p\tau$.
Substituting (2.5), (2.7), and (2.8) in (2.4) gives, after replacing $\tau$
by $\tau/p$,

(2.9) $p^{r/2+2}\sum_{j=1}^{p} b_j(W^{-j} - Z^j) + W - Z^{-1} = 0$,

an equation whose left member obviously has the factor $W - Z^{-1}$. If we
exclude this factor, multiply by $W^p$, and rearrange, we obtain

(2.10) $W^p - p^{r/2+2}\sum_{j=1}^{p}\sum_{k=0}^{j-1} b_jZ^{j-k}W^{p-k-1} = 0$,

which, with a change of summation variables, is exactly (2.2) and (2.3).

The remainder of the proof is taken up with establishing (2.1) and
calculating the $b_j$. First we show that $U_pZ$ is a function of $\Gamma_0(p)$. Its
invariance under the substitutions $V\tau$ of this subgroup follows from [1],
Lemma 1, p. 501:


Next, Z vanishes at $\tau = i\infty$; therefore, $U_pZ$ does also. Finally, at $\tau' = 0$, we
apply (2.4), (2.8) and have

(2.12) $pU_pZ(-1/p\tau) = pU_pZ(p\tau) + p^{-r/2}Z^{-1}(p\tau) - Z(\tau)$
$= p^{-r/2}x^{-p} + O(1)$,

i.e., $U_pZ$ has a pole of order p in x at $\tau' = 0$.

Since Z has a pole of first order at zero while $U_pZ$ has a pole of order p
there, it is clear from familiar arguments that $U_pZ$ is a polynomial in Z
of degree p. This polynomial will not have a constant term, since $U_pZ$ and
Z both vanish at infinity.




Write 
(2. 13) Z(r)i +--+), 


and note that, by an elementary theorem on binomial coefficients, 
(2. 14) p | 


Then, using (2.12) and (2.6), we have the following equations which secure 
agreement of the principal parts of the two members of (2.1) at r’—=0 and 
thus serve to determine the coefficients };: 


(2.15) 0 = bpp + bp, /2 


0 = bpp ?) + bp spt PY + 


By solving the system (2.15) recursively and remembering (2.14) one 
finds easily that 


(2. 16) 
| bj, j=1, 


Hence, $b_j$ is always an integer when $j > 1$, since $r \ge 2$; and $pb_1$ is an integer.
Actually, it will turn out in the calculations that $b_1$ itself is integral except
when $p = 13$.

It remains to compute the $b_j$. We list below for each value of p the
expansion of $p^{r/2}Z(-1/p\tau)$ to p terms, from which the needed powers of Z
are readily obtained. From these and (2.12), the $b_j$ are calculated comparatively
easily. The $b_j$ are also listed.

The $b_j$ shown in the table agree, for $p = 5, 7$, with the coefficients of
Rademacher's modular equations ([3], pp. 626, 628). However, we have
used a slightly different notation. If, in our equation (2.9), $\tau$ is replaced
by $5\tau$, and the equation is then multiplied by $ZW^{-1}$, it becomes Rademacher's
equation following (11.4), with the identifications

$X = W^{-1}(5\tau)$, $Y = Z(5\tau)$, $5b_j = B_j$.

For $p = 7$, our formula (2.9) is directly comparable with Rademacher's
equation (12.5).
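
The coefficients $b_j$ can be recomputed by comparing power series at $\tau = i\infty$ instead of at $\tau' = 0$. The sketch below is an illustration under stated assumptions, not the procedure of the text: it assumes $\Phi_p = x\prod\{(1-x^{pm})/(1-x^m)\}^r$ with $r(p-1) = 24$, the normalization $U_pZ = p\sum b_jZ^j$, and the coefficient form of $U_p$; the function names are hypothetical.

```python
from fractions import Fraction

def phi_series(p, N):
    """Coefficients s[0..N-1] of Phi_p = x * prod ((1 - x^{pm}) / (1 - x^m))^r."""
    r = 24 // (p - 1)
    s = [0] * N
    s[1] = 1                                   # the leading factor x
    for m in range(1, N):
        for _ in range(r):
            for k in range(m, N):
                s[k] += s[k - m]               # divide by (1 - x^m)
    for m in range(1, N):
        if p * m >= N:
            break
        for _ in range(r):
            for k in range(N - 1, p * m - 1, -1):
                s[k] -= s[k - p * m]           # multiply by (1 - x^{pm})
    return s

def series_mul(a, b, N):
    """Product of two power series, truncated to N terms."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        if ai:
            for k, bk in enumerate(b[:N - i]):
                c[i + k] += ai * bk
    return c

def modular_b(p):
    """Solve U_p Z = p * sum_{j=1}^{p} b_j Z^j for b_1, ..., b_p."""
    z = phi_series(p, p * p + 1)
    M = p + 1
    # Coefficient of x^n in U_p Z / p is z[p n] / p.
    target = [Fraction(z[p * n], p) for n in range(M)]
    base = [Fraction(c) for c in z[:M]]
    power = base[:]                            # Z^j, starting with j = 1
    b = []
    for j in range(1, p + 1):
        bj = target[j] / power[j]              # Z^j = x^j + ..., so power[j] = 1
        b.append(bj)
        target = [t - bj * c for t, c in zip(target, power)]
        power = series_mul(power, base, M)
    return b

print(modular_b(2), modular_b(3))
```

Exact rational arithmetic is used because, for $p = 13$, the first coefficient is not an integer.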


VALUES OF $b_j$ IN (2.1)
p 
2 3 5 7 13 
j 
1 3.2? 10.3 63 82 1,165.13-2 
2 gue 4.3° 52.58 176.72 9,604 
3 gre 63.55 845.78 27,272.13 
4 6.58 272.78 41,140.13? 
5 510 46.77 3,014.134 
| 6 4.7° 25,660.134 
7 12,086.13° 
8 4,180.13° 
9 1,064.13? 
10 196.138 
11 25.13° 
12 2.13%° 
13 132° 


EXPANSION OF $p^{r/2}Z(-1/p\tau)$ IN POWERS OF $x = \exp 2\pi i\tau$


      Power of x
 p    -1   0   1   2   3   4   5   6   7   8   9  10  11
 2     1 -24
 3     1 -12  54
 5     1  -6   9  10 -30
 7     1  -4   2   8  -5  -4 -10
13     1  -2  -1   2   1   2  -2   0  -2  -2   1   0   0
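
The rows of this table can be regenerated from a product expansion. The sketch below is an illustration and assumes the relation $p^{r/2}Z(-1/p\tau) = Z^{-1}(\tau) = x^{-1}\prod\{(1-x^m)/(1-x^{pm})\}^r$ with $r(p-1) = 24$; the function name is hypothetical.

```python
def inverse_phi_expansion(p):
    """First p coefficients of 1/Phi_p, i.e. those of x^{-1}, x^0, ..., x^{p-2}."""
    r = 24 // (p - 1)
    n_terms = p
    s = [1] + [0] * (n_terms - 1)              # series after removing x^{-1}
    for m in range(1, n_terms):
        for _ in range(r):
            for k in range(n_terms - 1, m - 1, -1):
                s[k] -= s[k - m]               # multiply by (1 - x^m)
    for m in range(1, n_terms):
        if p * m >= n_terms:
            break                              # these factors start at x^p
        for _ in range(r):
            for k in range(p * m, n_terms):
                s[k] += s[k - p * m]           # divide by (1 - x^{pm})
    return s

for p in (2, 3, 5, 7, 13):
    print(p, inverse_phi_expansion(p))
```

Since only p terms are kept, the factors $(1 - x^{pm})^{-r}$ do not yet contribute; the second loop is retained only so that the function matches the product it is named after.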


Remarks. (1) If only the modular equation (2.2) is wanted, one can
proceed more easily as follows. $j(\tau)$ is a rational function of Z of the form


(2.17) j(r) + 


the integers a, being determined by the expansion of the two members of the 
equation at zero. Now using the invariance of j under the substitution
$\tau' = -1/\tau$, we obtain


0 0 


where we have employed the relation 


$Z(-1/\tau) = W^{-1}(\tau)$,




obtained from (2.6). Equation (2.18) is of the same form as (2.9), from
which the modular equation follows.


(2) The foregoing discussion has been carried out for a particular
modular function of $\Gamma_0(p)$. However, any univalent modular function,
$\Psi$, of $\Gamma_0(p)$, possesses a modular equation. This equation may be obtained
easily from (2.2) by using the relation between $\Psi$ and Z, which, in accordance
with the univalence of both functions must be of the form

$Z = (\alpha\Psi + \beta)/(\gamma\Psi + \delta)$, $\alpha\delta - \beta\gamma \ne 0$.

W is, of course, the same function of $p^{r/2}\Psi(\tau/p)$.


3. Congruences for powers of 2 and 3. In order to prove Theorem 1
for $\alpha > 1$, we shall have to iterate equation (1.10), as was done in §§ 5-6 of
[2]. We treat the case $p = 2$ first.
In order to have (1.10) in a form suitable for iteration, we rewrite it as

(3.1) $U_2j = B_0 + 2^{11}\sum_{\lambda=1}^{4} 2^{12(\lambda-1)}B_\lambda\Phi^\lambda = B_0 + 2^{11}R$,

where R is a polynomial of the form *

(3.2) $R = d_1\Phi + d_22^{8}\Phi^2 + \cdots + d_t2^{8(t-1)}\Phi^t$ $(t \ge 1)$

with integral d. Applying the operator $U_2$ to both sides we obtain

(3.3) $U_2^2j = B_0 + 2^{11}\sum_{\lambda=1}^{4} 2^{12(\lambda-1)}B_\lambda U_2\Phi^\lambda = B_0 + 2^{11}U_2R$.

In the above equations, the B's are integers. $U_2^2$ is the iterate of $U_2$:

$U_2^2j(\tau) = U_2\{U_2j(\tau)\}$.

We wish to have the right member of (3.3) in the same form as that
of (3.1) but with a higher power of 2 multiplying the sum. We shall, in
fact, prove

(3.4) $2^{8(h-1)}U_2\Phi^h = 2^3R$, $h = 1, 2, \cdots$.

Theorem 1 will follow without difficulty from (3.4).
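The proof of (3.4) is by induction. First, note that

(3.5) $p^{rh/2+1}U_p\Phi^h = p^{rh/2}\sum_{\lambda=0}^{p-1}\Phi^h((\tau+\lambda)/p) = \sum_{\lambda=0}^{p-1}W_\lambda^h$,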


* R is not necessarily the same polynomial at each appearance. 


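where $W_\lambda = p^{r/2}\Phi((\tau+\lambda)/p)$ are the conjugates of the algebraic (modular)
equation (2.2). Thus with

(3.6) $S_h = \sum_{\lambda=0}^{p-1}W_\lambda^h$,

(3.4) is equivalent to

(3.7) $S_h = 2^{4h+12}R$,

since $r = 24$ when $p = 2$.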


The proof makes use of Newton's formulae for the sum of powers of
the conjugates of an algebraic equation: *

(3.8) $S_h = \sum_{j=1}^{h}(-1)^{j-1}p_jS_{h-j}$, $h = 1, 2, \cdots$,

with the conventions

$p_j = 0$ for $j > p$,

$S_0 = h$,

where the algebraic equation is written as in (2.2). We now calculate $S_1$
and $S_2$.
From (3.8) we have

(3.9) $S_1 = p_1$,

(3.10) $S_2 = p_1S_1 - 2p_2 = p_1^2 - 2p_2$.
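
Newton's formulae can be exercised on a small numerical example. The sketch below is an illustration only: it assumes the recursion $S_h = \sum_{j=1}^{h}(-1)^{j-1}p_jS_{h-j}$ with the convention $S_0 = h$, and the function name is hypothetical. The test polynomial has the roots 1, 2, 3, so that $p_1 = 6$, $p_2 = 11$, $p_3 = 6$.

```python
def power_sums(p, hmax):
    """Power sums S_1, ..., S_hmax of the roots from the elementary
    symmetric functions p = [p_1, ..., p_n], by Newton's formulae."""
    n = len(p)
    S = []
    for h in range(1, hmax + 1):
        total = 0
        for j in range(1, h + 1):
            pj = p[j - 1] if j <= n else 0        # p_j = 0 beyond the degree
            s_prev = h if j == h else S[h - j - 1]  # convention S_0 = h
            total += (-1) ** (j - 1) * pj * s_prev
        S.append(total)
    return S

roots = [1, 2, 3]
print(power_sums([6, 11, 6], 4))
print([sum(w ** h for w in roots) for h in range(1, 5)])
```

Both lines print the same list, which is the content of the recursion for $h \le 4$.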

From (2.3) we get

(3.11) $p_1 = 2^{16}(3\Phi + 2^8\Phi^2) = 2^{16}R$,

(3.12) $p_2 = -2^{24}\Phi = -2^{24}R$,

where we have used the values of $b_j$ given in the table toward the end of § 2.


The polynomials $R_1$,

(3.13) $R_1 = 2^8R = \sum_{j=1}^{t} d_j2^{8j}\Phi^j$,

form a ring. In terms of $R_1$ we can rewrite (3.11), (3.12) as

(3.14) $p_1 = 2^8R_1$, $p_2 = 2^{16}R_1$.

Hence, from (3.9),


* The above device was first used in this connection by G. N. Watson [4]. 




(3.15) $S_1 = 2^8R_1$,
$S_2 = (2^8R_1)^2 + 2\cdot 2^{16}R_1 = 2^{16}R_1^2 + 2^{17}R_1 = 2^{16}R_1 + 2^{17}R_1 = 2^{12}R_1$,

satisfying (3.7) for $h = 1, 2$.
Now let $k > 2$ and assume that (3.7) is fulfilled for $h < k$:

(3.16) $S_h = 2^{4(h+1)}R_1$, $h = 1, 2, \cdots, k-1$.

(3.8) gives

(3.17) $S_k = p_1S_{k-1} - p_2S_{k-2}$ $(k > 2)$
$= 2^{4k+8}R_1 - 2^{4k+12}R_1 = 2^{4(k+1)}R_1$,

which, together with (3.15), completes the proof of (3.7), i.e. of (3.4).


A simple consequence of (3.4) is

(3.18) $U_2R = 2^3R$.

We can now apply (3.18) to (3.3) and obtain

(3.19) $U_2^2j = B_0 + 2^{14}R$.

By successive iteration we have

(3.20) $U_2^mj = B_0 + 2^{3m+8}R$, $m = 1, 2, 3, \cdots$.

Since, however, we also have

(3.21) $U_2^mj = \sum_{\mu=0}^{\infty} c_{2^m\mu}\,x^\mu$

by iteration of (1.3), the truth of Theorem 1 follows at once for all positive
integral $\alpha$ and $p = 2$.
The development for the case $p = 3$ is parallel. We write (1.10) as

(3.22) $U_3j = B_0 + 3^5T$,


where T is of the form

(3.23) $T = \sum_{l=1}^{t} d_l3^{4(l-1)}\Phi^l$




and the $d_l$ are integers. We prove

(3.24) $3^{4(h-1)}U_3\Phi^h = 3^2T$, $h = 1, 2, \cdots$,

or, what is equivalent,

(3.25) $S_h = 3^{2h+3}T_1$.

From § 2 we find without difficulty

(3.26) $p_1 = 3^9T = 3^5T_1$,
$p_2 = 3^{14}T = 3^{10}T_1$,
$p_3 = 3^{18}T = 3^{14}T_1$,

where the polynomials $T_1$,

(3.27) $T_1 = 3^4T$,

form a ring. From (3.8), we have

(3.28) $S_1 = p_1$, $S_2 = p_1^2 - 2p_2$, $S_3 = p_1^3 - 3p_1p_2 + 3p_3$.

Substituting (3.26) in (3.28), we obtain

(3.29) $S_1 = 3^5T_1$,
$S_2 = (3^5T_1)^2 + 2\cdot 3^{10}T_1 = 3^7T_1$,
$S_3 = (3^5T_1)^3 + 3\cdot 3^5T_1\cdot 3^{10}T_1 + 3\cdot 3^{14}T_1 = 3^9T_1$.

Thus, (3.25) is proved for $h = 1, 2, 3$.
Now let $k > 3$, and let (3.25) be true for $h \le k - 1$,

(3.30) $S_h = 3^{2h+3}T_1$, $h = 1, 2, \cdots, k-1$.

Then, from (3.8), (3.26), and (3.30), we get

$S_k = p_1S_{k-1} - p_2S_{k-2} + p_3S_{k-3}$ $(k > 3)$
$= 3^{2k+3}T_1$,

as required for the proof of (3.25). Theorem 1 for $p = 3$ now follows in
the same way as in the case $p = 2$.


4. Comparison with numerical results. H. S. Zuckerman has computed
the coefficients $c_n$ for $n \le 24$ ([5]). The table which follows shows
the factorization of the $c_n$ with respect to primes < 13.




HIGHEST POWER OF p DIVIDING $c_n$


n \ p 2 3 5 7 11
1 2 3 0 0 0
2 11 0 1 0 0
3 1 5 1 0 0 
4 14 3 0 0 0 
5 3 0 2 0 0 
6 13 6 0 0 1 
7 0 3 1 1 0 
8 17 1 3 0 0 
9 2 7 1 0 0 
10 12 5 3 0 0 
11 1 1 0 0 1 
12 16 5 1 0 0 
13 3 3 1 0 0 
14 14 0 0 a 0 
15 0 7 2 1 0 
16 20 4 1 0 0 
17 3 2 1 0 0 
18 1 8 1 0 0 
19 1 3 3 0 0 
20 15 0 2 0 0 
21 7 5 0 2 0 
22 13 5 1 0 1 
23 2 1 1 0 0 
24 19 6 0 1 0 


This table serves to verify Theorem 1 and Theorems 1 and 2 of [2]
for $n \le 24$. Note that these theorems predict the exact power of the prime
dividing $c_n$ for $n = p$ and $n = p^2$ ($p < 13$).


5. Modular functions with congruence properties. The preceding discussion
does not by any means exhaust the possibilities as regards congruence
relations for Fourier coefficients of modular functions. Extensions are possible
both to other moduli and modular functions other than $j(\tau)$. We close with
an example of a theorem of the latter type.


THEOREM 3. Let $F(\tau)$ be an entire modular function of $\Gamma_0(p)$, with
$p \le 11$, having the expansions *

(5.1) $F(\tau) = \sum_{n=-s}^{\infty} a_nx^n$,


* $F(\tau)$ necessarily has expansions in x at both parabolic vertices of $\Gamma_0$.




(5. 2) F(— 1/pr) + b_.27 
where the coefficients are integers, and

(5.3) $s < p$.

Then the Fourier coefficients $a_n$, with $n \ge 1$, have the congruence properties
of Theorem 1 of this paper and Theorems 1 and 2 of [2], viz., if $n \equiv 0$
(mod $2^a3^b5^c7^d11^e$), then $a_n \equiv 0$ (mod $2^{3a+8}3^{2b+3}5^{c+1}7^{d}11^{e}$) for a, b, c, d
= 1, 2, 3, $\cdots$; and e = 1, 2.


Obviously $j(\tau)$ satisfies the hypotheses for all p.


Proof. By (5.1) and (1.3) we see that $U_pF(\tau)$ is regular at infinity,
since $s < p$:

(5.4) $U_pF(\tau) = a_0 + a_px + \cdots$.

At zero, we use (1.4), (5.1), (5.4), and (5.2), the last two with $\tau$ replaced
by $p\tau$, and find


(5. 5) pU,F (—1/pr) = O(1) + +b 


At all interior points of the upper half-plane $U_pF(\tau)$ is regular. Moreover,
it remains invariant under the substitutions of $\Gamma_0(p)$, as may be checked by
reference to Lemma 1 of [1] (p. 501). Therefore, $U_pF(\tau)$ is an entire
modular function on the subgroup $\Gamma_0(p)$.

We now separate the cases $p \le 7$, $p = 11$, treating the former first.
By fitting a polynomial in $\Phi_p$ to $U_pF$ at $\tau' = 0$, we find in the usual way
an identity of the form


J 
(+) = const. -+ 
j=1 
with integral dj, i.e., 


(5. 6) U,F(r) = const. + 


where V is a polynomial of type R ($p = 2$), T ($p = 3$), as in § 3 of this
paper, or Q ($p = 5, 7$), cf. § 5 of [2]. No difficulty arises at $i\infty$ since
both members of (5.6) are regular there. The iteration of (5.6) under the
operator $U_p$ now proceeds in exactly the same way as in § 3 of this paper and
§§ 5, 6 of [2] and establishes the congruences of the theorem when $p \le 7$.

For p= 11 we apply the methods of [2], 3, 4 and obtain identities with 




integer coefficients for in terms of the functions A,(7), C,,(r), 
which themselves have integral coefficient expansions. The congruences for
$p = 11$, $11^2$ are then immediate consequences.


Added in proof. One additional value of c, appears in the literature, 
namely C25, on p. 491 of D. H. Lehmer’s paper, “ Properties of the coefficients 
of the modular invariant J(r),” American Journal of Mathematics, vol. 64 
(1942), pp. 488-502. c.; has the factorization 2? - 3- 5°- A, where p < 13 does 
not divide A. This value further illustrates the remark at the close of 4. 


Theorem 3 can be extended to the case e—3. See addendum at the 
end of [2]. 


NEW CITY.


BIBLIOGRAPHY. 


[1] J. Lehner, "Ramanujan identities involving the partition function for the moduli
$11^\alpha$," American Journal of Mathematics, vol. 65 (1943), pp. 492-520.

[2] J. Lehner, "Divisibility properties of the Fourier coefficients of the modular invariant
$j(\tau)$," American Journal of Mathematics, vol. 71 (1949), pp. 136-148.

[3] H. Rademacher, "The Ramanujan identities under modular substitutions," Transactions
of the American Mathematical Society, vol. 51 (1942), pp. 609-636.

[4] G. N. Watson, "Ramanujans Vermutung über Zerfällungsanzahlen," Journal für die
reine und angewandte Mathematik, vol. 179 (1938), pp. 97-128.

[5] H. S. Zuckerman, "The computation of the smaller coefficients of $J(\tau)$," Bulletin
of the American Mathematical Society, vol. 45 (1939), pp. 917-919.



DIOPHANTINE EQUATIONS IN ALGEBRAIC NUMBER FIELDS.*
By L. G. PECK.


Consider a system of algebraic equations,

(1) $f_j(x_1, \cdots, x_q) = 0$ $(j = 1, \cdots, h)$,

where $f_j$ is a homogeneous polynomial of degree $m_j$ with coefficients belonging
to the ring J of integers of a given finite algebraic extension K of the
rational field. The main object of this paper is to prove the following result.


THEOREM 1. If K is totally complex, then for every system of degrees
$m_1, \cdots, m_h$ and every positive rational integer m there exists a positive
number $Q = Q(m; m_1, \cdots, m_h; K)$ such that whenever $q \ge Q$ the system
of equations (1) has an m-dimensional linear manifold of solutions in J.


Using Theorem C of [I],¹ one sees that Theorem 1 is equivalent to the
following


THEOREM 2. If K is totally complex, then for every positive rational
integer m there exists a positive number $Q = Q(m; K)$ such that whenever
$q \ge Q$ the equation

(2) $\alpha_1\lambda_1^m + \cdots + \alpha_q\lambda_q^m = 0$ $(\alpha_1, \cdots, \alpha_q \in J)$

has non-trivial solutions in J (the trivial solution being $\lambda_1 = \cdots = \lambda_q = 0$).


It should be remarked at once that the requirement that K be totally
complex is also necessary for the conclusions in Theorems 1 and 2, since in
the case that K has real conjugates there exist forms in arbitrarily many
variables one of whose real conjugates is positive definite. Nevertheless, it
will turn out to be possible to obtain a generalization of Theorem 2 for
non-totally complex fields by adding the requirement that the form
$\alpha_1\lambda_1^m + \cdots + \alpha_q\lambda_q^m$ be indefinite (cf. Theorem 3 below).

The method which I use in the proof was developed by C. L. Siegel
[II, III] for the treatment of Waring's problem in algebraic number fields.


* Received November 17, 1948. 
¹ Numbers in square brackets indicate papers in the list of references at the end.




Siegel's calculations for the form $\lambda_1^m + \cdots + \lambda_q^m$ apply with only slight
modifications to the form $\alpha_1\lambda_1^m + \cdots + \alpha_q\lambda_q^m$, the only essential difference
arising in the proof that the "singular series" has a positive lower bound.
I wish to express here my deep gratitude to Professor Siegel for his invaluable
advice in the preparation of this paper.


1. Notation and plan of proof. Let K be an algebraic field of degree
n over the rationals. $K^{(1)}, \cdots, K^{(r)}$ are the real conjugates and $K^{(r+1)}, \cdots,
K^{(r+2s)}$ the complex conjugates of K. Here $r + 2s = n$. The complex conjugates
are so arranged that, for $k = r+1, \cdots, r+s$, the fields $K^{(k)}$ and $K^{(k+s)}$
are obtained from one another by interchanging $i$ $(= \sqrt{-1})$ and $-i$.

Elements of K are denoted by small Greek letters; their conjugates by
the same letters with superscripts. The norm and trace are defined by
$N(\alpha) = \alpha^{(1)}\cdots\alpha^{(n)}$, $S(\alpha) = \alpha^{(1)} + \cdots + \alpha^{(n)}$, the same definition holding
also for Greek letters which are not necessarily elements of K.

If $\omega_1, \cdots, \omega_n$ is a basis of J, then the transposed inverse $(\rho_l^{(k)})$ of the
matrix $(\omega_l^{(k)})$ yields a basis $\rho_1, \cdots, \rho_n$ of the ideal $\mathfrak{d}^{-1}$, where $\mathfrak{d}$ is the
ramification ideal. For the absolute value D of the discriminant of K it
follows that $D^{1/2} = |\det(\omega_l^{(k)})|$, $D = N(\mathfrak{d})$. It will be important to recall
the following property of $\mathfrak{d}$:

$\delta \in \mathfrak{d}^{-1}$ if and only if $S(\omega_l\delta)$ = a rational integer for $l = 1, 2, \cdots, n$.
Introducing real variables 2%, yx, yxr (kK =1,---,n; 


define 


(3) 
m= OnYnt (J =1,2,°- 


The points $(x_1, \cdots, x_n)$, $(y_1, \cdots, y_n)$, etc., are to be regarded as variable
points in an n-dimensional Euclidean space X. The conjugates of $\xi$, $\eta$, $\eta_l$
are obtained by inserting the superscript $^{(k)}$ above each of the Greek letters
in (3). All relations involving Greek letters without superscripts (except
those with N or S) will henceforth stand for the n relations obtained by
inserting superscripts as above.

A homogeneous polynomial $f(x_1, \cdots, x_q)$ with coefficients in K will
be called (in)definite if each of the real polynomials $f^{(k)}(x_1, \cdots, x_q)$
$(k = 1, \cdots, r)$ is (in)definite. Thus, in totally complex fields, the requirement
of (in)definiteness is no restriction at all. It must be remarked, however,
that in fields with real conjugates there are polynomials which are
neither definite nor indefinite (e. g., if 6? 2, one conjugate of — is 




indefinite while the other is definite). The generalization of Theorem 2 
mentioned in the introduction may now be stated. 


THEOREM 3. If g=1-+ max [4m?™8, (21 4 n)mn] and if the poly- 
nomial a,2,"-+-- - --+ az” is indefinite, then (2) has non-trivial solutions 
in J. 

Note that q depends only on the degree n of K. Hence, in Theorems 1,
2, $Q(m; m_1, \cdots; K)$ and $Q(m; K)$ may be replaced by $Q(m; m_1, \cdots; n)$
and $Q(m; n)$.

If $\alpha_j = 0$, then (2) has the solution $\lambda_j = 1$, $\lambda_k = 0$ $(k \ne j)$; hence it
will be assumed throughout that $\alpha_k \ne 0$ $(k = 1, \cdots, q)$.

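The first step in the proof is to obtain an expression for the number
$N(T)$ of solutions of (2) satisfying $|\lambda_l| < T$ $(l = 1, \cdots, q)$. To this
end, using the notation $1^{\omega} = e^{2\pi i\omega}$, introduce the function

$f(x; \alpha) = \sum_{|\lambda| < T} 1^{S(\alpha\lambda^m\xi)}$ $(\alpha \in J;\ \alpha \ne 0)$,

and define

$f(x) = f(x; \alpha_1)\cdots f(x; \alpha_q)$.

From the fact that the function $1^{S(\mu\xi)}$ has, when $\mu$ is an integer, period 1
in each of the variables $x_1, \cdots, x_n$, it follows that

$\int_E 1^{S(\mu\xi)}\,dx = 0$ $(\mu \ne 0)$, $= 1$ $(\mu = 0)$ $(dx = dx_1\cdots dx_n)$,

where the domain of integration is the unit cube

E: $0 \le x_k < 1$ $(k = 1, \cdots, n)$.

Hence

(4) $N(T) = \int_E f(x)\,dx$.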
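
The counting principle behind (4) can be illustrated in the simplest setting. The sketch below is an illustration under loud assumptions: it takes K to be the rational field (so $n = 1$ and J is the ring of rational integers), and it replaces the integral over the unit cube by the equivalent finite average over the points $k/M$, where M exceeds the largest possible absolute value of the form. The function names and the sample form $\lambda_1^2 + \lambda_2^2 - 2\lambda_3^2$ are hypothetical choices.

```python
import cmath
from itertools import product

def brute_count(alpha, m, T):
    """Number of integer solutions of sum alpha_l * lam_l^m = 0 with |lam_l| < T."""
    rng = range(-T + 1, T)
    return sum(1 for lam in product(rng, repeat=len(alpha))
               if sum(a * l ** m for a, l in zip(alpha, lam)) == 0)

def fourier_count(alpha, m, T):
    """The same count by orthogonality of exponentials (discrete analogue of (4))."""
    bound = sum(abs(a) for a in alpha) * (T - 1) ** m
    M = bound + 1                  # |value of the form| <= bound < M
    total = 0
    for k in range(M):
        term = 1
        for a in alpha:
            # f(k/M; a) = sum over |lam| < T of exp(2 pi i a lam^m k / M)
            term *= sum(cmath.exp(2j * cmath.pi * ((k * a * l ** m) % M) / M)
                        for l in range(-T + 1, T))
        total += term
    return round((total / M).real)

alpha, m, T = [1, 1, -2], 2, 4
print(brute_count(alpha, m, T), fourier_count(alpha, m, T))
```

Because the form cannot reach M in absolute value, it vanishes exactly when it is divisible by M, so the average of the exponentials picks out the solutions just as the integral in (4) does.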


In what follows, a slight modification of Siegel's generalized Farey
dissection will be applied to the cube E and the contributions of the major
and minor arcs will then be estimated. In this way it will be shown that

(5) $N(T) = \sigma J\,T^{n(q-m)} + o(T^{n(q-m)})$,

where the "singular series" $\sigma$ and the "singular integral" J will turn out
to be positive and independent of T. In (5) and throughout the paper, the
symbols o and O refer to the passage to the limit $T \to \infty$.


2. The Farey-Siegel dissection. Since Theorem 3 is trivial for m = 1, 
assume m = 2 and let 




a= (221), Tro, 
Let 
c= max | a, |, 


k=1,...,7r+8 
J=1,. 


and suppose 7’ is sufficiently large to ensure T?* > 2cD™", 

For each $\gamma$ in K the ideal $\mathfrak{a}_\gamma$ is defined by the relation $\mathfrak{a}_\gamma^{-1} = (1, \gamma\mathfrak{d})$,
and for any positive real c the domain $B(\gamma, c)$ is the set of points $x \in X$
which satisfy

N[max (T™ | €—y|, e7)] S #*N(a,7). 


(The domain $B_\gamma$ in [III] is here denoted by $B(\gamma, 1)$.) Clearly, $B(\gamma, c)$ is
empty if $N(\mathfrak{a}_\gamma) > (ct)^n$; moreover, $B(\gamma, c)$ and $B(\gamma', c)$ have no common
point when $\gamma \ne \gamma'$, for if x were a common point of the two regions and


max (T"|£—y|,c*)— tot, max |, — to", 
then 
oS ct, cl, N (aya) = N(o0’), 
and 
| ST +0") 
S 
N((y—Yy) aay) S |) N 
(Sere) < D*; 


this would mean that the integral ideal $(\gamma - \gamma')\mathfrak{a}_\gamma\mathfrak{a}_{\gamma'}\mathfrak{d}$ has norm less than 1,
which is impossible.


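Because of the periodicity of $f(x)$ it is clear that whenever $\gamma \equiv \gamma'$
(mod $\mathfrak{d}^{-1}$),

$\int_{B(\gamma,c)} f(x)\,dx = \int_{B(\gamma',c)} f(x)\,dx$.

Consequently, if $\Gamma$ is a complete set of modulo $\mathfrak{d}^{-1}$ incongruent numbers $\gamma$
satisfying $N(\mathfrak{a}_\gamma) \le (ct)^n$, and if $X_0(c)$ is the subset of X consisting of those
points which (for the particular c under consideration) do not lie in any
$B(\gamma, c)$, then (4) may be written in the form

(6) $N(T) = \sum_\gamma \int_{B(\gamma,c)} f(x)\,dx + \int_{X_0(c)} f(x)\,dx$,

where $\gamma$ (here and in the sequel) runs over the finite set $\Gamma$.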




3. The minor arcs. The last integral in (6) will be estimated in the 
three lemmata of this section. 


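LEMMA 1. If $\alpha$ is an integer, $\alpha \ne 0$, then

$N(\mathfrak{a}_{\alpha\gamma}^{-1}) \le N(|\alpha|)\,N(\mathfrak{a}_\gamma^{-1})$.


Proof. $\mathfrak{a}_{\alpha\gamma}^{-1}$ consists of all numbers of the form $\lambda + \alpha\gamma\delta$ where $\lambda \in J$,
$\delta \in \mathfrak{d}$; on the other hand, $\alpha\mathfrak{a}_\gamma^{-1}$ consists of all numbers of the form $\alpha(\lambda + \gamma\delta)$
where again $\lambda \in J$, $\delta \in \mathfrak{d}$. Hence $\mathfrak{a}_{\alpha\gamma}^{-1} \mid \alpha\mathfrak{a}_\gamma^{-1}$ and the lemma follows.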


LeMMA 2. For the integer a (0) let 

If |a| Sc and re X,(c), then a’ e X(1). 

Proof. Let y’=ay. Since 

N[max (7" | —y|,¢*)] > (ay) 

for every y in K, it follows that 

N{max | | ,1)] = |é—y|, | 

> (| |)N = (ay) 

for every y’ in K. This proves Lemma 2. 

Lemma 3. If xe Xo(c) and |a|Sc then 

a) == 


Proof. By Lemma 2, z’e¢X,(1). Now the proof of Lemma 8 in [III] 
may be used with the following modifications: a is to be replaced by 2’, é by &, 
and the set & by the set of integers A satisfying |A| < T. 


Lemma 3 makes it possible to write (6) in the form 
(7) RT) de + 0( Tm), 
B(y;e) 


4, The major arcs. 
Lemma 4. For any integer «a (+40) set 


Ga(y) 


umodary 




where $\mu$ runs over a complete set of incongruent integers modulo the ideal $\mathfrak{a}_\gamma$.
If $x \in B(\gamma, c)$, then


= Ga(y) dy + 
lja|<T 


where is defined by (3) and dy = - dyn. 


Proof. Since N[max StN(a,*) and N(a,) 
(ct)", one can determine positive numbers - -, with 9) == 9 (is) 
(k=r+1,---,r-+ 8) such that 


6max (TJ |é—y|,c7) StD/*", N(0) = (ay). 


By Minkowski’s theorem, the ideal a =a, contains a number @ satisfying 
Then aa? —b is integral and V(b) so that b belongs 
to a finite set depending only on K. The basis Ai,- - -, Bn of 6b gives rise 
to a basis & (k—=1,---,n) of a; and, since —O(1), it follows 
that % == 0(6). 

Clearly 


(8) f(z; a) = Dd 1 


where p» runs over a complete set of integral residues modulo a and A runs 
over all elements of a which satisfy |A + |< T. Introduce the variable 
point z in X and let {= Gz, +---+ If Gn%n 
with rational integers g;,- - -,9n then the cube F(A) may be defined as the 
set of points z satisfying g.S 2% <ge+1 (k=1,---,n). It follows that 
when ze E(A), 

= O(1), 


and 
= |A+e|"") 
= a6(é—y)O(T™) = atT-"0(T™") = O(T*) ; 
hence 
(9) ySla(r+u)™ (Ey) dz + O(T*). 


E(x) 


For a fixed » the conjugates of any which occurs in the inner sum in (8) 
all lie in fixed intervals of length O(7). Hence the number of these A is | 


N(a)O(T”), and therefore the error term which results when (9) 38 
summed with respect to A is N(a*)O(T"*). 



DIOPHANTINE EQUATIONS IN ALGEBRAIC NUMBER FIELDS. 


For real wu, let F(u) be the domain defined by the inequality | {+ y| 


<T-+ 6; then the volume of F(u) is 


ao 
There exists a positive constant A = O(1) such that 
F(—A)C C F(A); 
and, on the other hand, 
F(— A) C F(0) C F(A). 
Since (10) implies that 
V(A) — V(— A) =N(a")T™'0(6) = N(a")O(T**), 


it follows from (9) that 


 F(0) 


Using (3), substitute +» —vy in the above integral. Since 
2n)/O(Y1,° Yn) = N (a), 
the lemma is a consequence of (8) and (11). 
Lemma 5. For N(a,) the estimate 


Ga(y) = N(a,/™")0(1) 
holds. 


This will be proved in 7. 


LEMMA 6. 


f 1Slen™ (§-9) == N[min (1, T-* | 
Inl<T 


This follows in the same way as equations (80)-(83) in [III]; the details 


of the proof are therefore omitted. 


To obtain an estimate of the first term on the right side of (7), set 
G(y) = Ga,(y)° and du: = dyn Then 


lemmata 4, 5, 6 imply 




N(T + u6) 0), 
(otherwise). 




(12) f(x) = G(y) f (EVI du, 
Im|<T 
+ + N(ay-0/m) N[min(1, | ¢(€—y)| ]O( Tare), 
Moreover, 
B(¥,c) 
f N[min(1, | of | O(T-™), 
x 
and (since g—1 > 2m) 
(14) TN (ay0/m) — O(1). 
a 
Using the obvious estimate }} | dx=1 and the fact that a(qg—1) = mn, 
B(y,c) 


it now follows from (7), (12), (13), and (14) that 
N(T) = 


= G(y) f f f dug + 
B(y,0)% |m|<T Ina <T 


where » = "+: +--+ agyq™. As in [III, p. 336], it is now possible to 
replace the integration domain B(y,c) by the entire space X with an error 
of Thus, 


(15) R(T) —o(T)1(T) + 

where 

(16) o(T) —2 G(y) 

and 


6. The singular integral. If the substitutions 7, Tm (1 
and —T7-™é are made in the integral (17), there results 


I(T) = (1). 


It will be shown in this section that the singular integral I(1) = J, which is
clearly independent of T, is positive. The following two cases require separate
consideration: (i) r > 0 and m ≡ 0 (mod 2); (ii) all others.




Let wu; denote the point (y11,- --,Yn:) in X, and consider the set € 
of points (w,°**,Uq) in an ng-dimensional Euclidean space defined by 
the inequalities 


<1 O<m <1 
—a/m in case (i); 
| <1 1,--+,9), <argyg® 
in case (il). 
Further, let 


(é) 18(8) du, + + dug 


where w = 2"m* in case (i), w =m in case (ii), and w has the same meaning 
asin 5. If J(p) = f &(é)1-S)dzr (the integral being absolutely con- 
x 


vergent by the reasoning at the end of 5), then clearly J =J(0). 
For a fixed w—o,t, +--+ ++ ontn, let E(w) be the set of points 


+, in an n(q—1)-dimensional Euclidean space for which ug 
Then every (w,° * -,Uq) €€ gives rise to a unique w and every w and 
+, Ug1) € E(w) gives rise to a unique Therefore, since 


it follows that 
®(é) —wf V(w) dt, 


where 


It is clear that ¥(w) vanishes outside of a certain bounded region, e. g., 
=0 when | o® | > |+---+]a for some k. The con- 
tinuity of ¥(w) is, moreover, a consequence of the following calculation. 


In case (ii), set 


Then, for l = 1, ..., q − 1, the Jacobian of the y's which have second
subscript l with respect to the v's and φ's which have second subscript l is
D-*N (m= | a | | 2-™), and 


(0) DOW m-m | N(ay) N(a)| Fe(o™) T 
k=1 k=r+1 


where, for k—1,-- -,r, 
F,(o™) f | OF Vgi(o™ — Vq-1) | (1-m)/mdy, 


the integral being extended over the domain 


and for 


— f Vg-1 (wo) v,2e't — — Vq-1 | (1-m) /m 


the integral being extended over the domain 


(I=1,---,q—1), 
In case (i), the domain ©(w) may be split into 27-1) domains, in each 
of which the quantities 4: (kK=1,---,r; 1=1,---,qg—1) are all of 
constant sign. In any one of these domains, new variables may be introduced 
by the formulae 
| | ar (41) ™ = 
and, as before, one obtains 


V(w) = | N(a,) N (aq) | TI 
=r+ 


where, for k ~1,-- -,7r, 


F,(o™) = f | ° — (sgn (*))y, —- 


(sgn q_1™)) | (1-m) /m dv, 8 




the integral extended over the domain 
0< <[a™| (J=1,---,q—1), 


0 < [w'*) (sgn a, (sgn 09-4] /aq™ 1; 


(19) 


and H;,(w)) has the same meaning as in case (ii). 

Classical considerations show that 7;,(o) and H;,(o)) are in both cases 
continuous. Hence (ww) is continuous and Fourier’s theorem shows that 
J(o) =w¥(o). Thus, J = wv¥(0). 

It is clear that the quantities H,(0) (kK=r-+41,---,r-+s) are 
positive; also in case (ii) F,(0) >O0 (kK=1,---,r). In case (i), the 
requirement that «,2,"-+----+ az" be indefinite means that for each 
fixed & in the interval 1= not all of the quantities - -, a, 
have the same sign. Hence the domain (19) with wo) —0 has a positive 
volume and again F,(0) >0 (k=1,---,r). Thus, in any case, 
I=wv¥(0) > 0. 


7. The singular series. The following four lemmata have as their 
immediate objective the proof of lemma 5, but will be used later in this 
section for the estimate of the quantity σ(T) in (16).


LemMaA 7. If (ay, ay) =1 then ayy =a,ay and 


+’) = Galy) Ga(y’). 
Proof. The hypotheses (ay, yday) = 1, (a,, ay) imply 
1 = (ay,ydayay) = (ay, (y + y’)dayay’) . 
Similarly, 


1 = (ay, (y + y’) da,a7) 
and therefore 


(ayay’, (y + y’) daya’) 
1. @., 


Oy = (1, (y + y’)d) = ayy. 
If A, d’ are integers such that 
ay | A, Ay’ | (A, ay’) 1, ay) 1, 


then 
umodayay’ 


Hmoday 


N (Gy) N (Gy) Ga(y) Galy’), 


mmoday “’moday’ 


which proves the lemma. 
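
The multiplicative property of Lemma 7 can be checked numerically in the simplest special case, assuming K is the rational field, so that ideals are generated by positive integers and the normalized sum reduces to (1/q) Σ exp(2πi a x^m / q). The function names below are illustrative only and do not come from the paper.

```python
import cmath
from math import gcd


def normalized_sum(a: int, q: int, m: int) -> complex:
    """Return (1/q) * sum over x mod q of exp(2*pi*i * a * x**m / q)."""
    total = sum(
        cmath.exp(2j * cmath.pi * ((a * pow(x, m, q)) % q) / q)
        for x in range(q)
    )
    return total / q


def multiplicativity_defect(a1: int, q1: int, a2: int, q2: int, m: int) -> float:
    """Distance between G(a1/q1 + a2/q2) and G(a1/q1) * G(a2/q2).

    Assumes gcd(q1, q2) = gcd(a1, q1) = gcd(a2, q2) = 1, which is the
    rational-field analogue of the hypothesis (a_gamma, a_gamma') = 1.
    """
    assert gcd(q1, q2) == 1 and gcd(a1, q1) == 1 and gcd(a2, q2) == 1
    # a1/q1 + a2/q2 = (a1*q2 + a2*q1) / (q1*q2), already in lowest terms.
    combined = normalized_sum(a1 * q2 + a2 * q1, q1 * q2, m)
    product = normalized_sum(a1, q1, m) * normalized_sum(a2, q2, m)
    return abs(combined - product)
```

The defect vanishes up to rounding because, by the Chinese remainder theorem, a residue modulo q1·q2 corresponds to a pair of residues modulo q1 and q2.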




Let p be a prime ideal, and for any integer α let P(α) be the rational
integer for which p^{P(α)} | α and p^{P(α)+1} ∤ α. In what follows, π is any
integer satisfying P(π) = 1.


Lemma 8. a,=p', 12=2P(m) + 2, then 
Gi(y) 


Proof. 
N(p')G,(y) = S18" 


p!-P(™-1 ymod pP(™ +1 
1S (p™+mp™-1 


p!-P(™)-1 vmod pP(m) +1 


The inner sum equals N(p?(™+1)18@™) if | if p| p; 
otherwise this sum vanishes. Hence 


= 
Amod p!-P(m +1 
Diu 


= = N (p?-P(m)-2) G, (xy) 


The last step is clear when m= P(m) +2. If m < P(m) + 2, it must be 
observed that 


N(p") G, (xy) > 
umod 
pmod pP(m+2-m 
N(pP(™)+2-m) > 18 (many) | 
“mod 


so that the step in question is again valid. This proves the lemma. 
LemMa 9. If ay=—p', 1=2P(ma) +2, and the rational integer 
satisfies 1— (k —1)m = 2P(ma) + 2, then 
Ga(y) = N(P*) 
Proof. Since day—=p'?™, it is clear that Ga(y) =G,(ay), and the 
desired result follows by repeated application of Lemma 8. 


Lemma 10. If ay—p and N(p) is sufficiently large (specifically, tf 
N(p(m-2)/2m) > m—1 and pf«), then 


| Ga(y)| 


Proof. Let = be the set of characters of the multiplicative group A of 



every prime ideal p, cp > 0. 




integers modulo p. If = ay, then as—p and, for any non-principal 
character yx &, 


AeA AcA weA 


=> = ly (y) N(p). 


AcAveaA \modp 
Hence 


Ga(y)| =| 1 + 


AcA 


=[1+ 18% (u)| S [(m, N(p) —1) — 1] (p%) 


wed 
xm=1 
= (m—1)N(p*) S 


which proves the lemma. 
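
As a sanity check on the character-sum argument, the following sketch tests the classical rational-field analogue of the bound: for a rational prime p not dividing a, the unnormalized sum Σ exp(2πi a x^m / p) has absolute value at most (gcd(m, p − 1) − 1)·√p. Taking K to be the rational field is an assumption made here for illustration, and the helper names are hypothetical.

```python
import cmath
from math import gcd, sqrt


def power_sum(a: int, p: int, m: int) -> complex:
    """Unnormalized exponential sum over x mod p of exp(2*pi*i * a * x**m / p)."""
    return sum(
        cmath.exp(2j * cmath.pi * ((a * pow(x, m, p)) % p) / p)
        for x in range(p)
    )


def bound_holds(p: int, m: int) -> bool:
    """Check |S(a)| <= (gcd(m, p-1) - 1) * sqrt(p) for every a not divisible by p."""
    limit = (gcd(m, p - 1) - 1) * sqrt(p)
    # A small tolerance absorbs floating-point rounding (the bound is sharp for m = 2).
    return all(abs(power_sum(a, p, m)) <= limit + 1e-9 for a in range(1, p))
```

Dividing by p gives the normalized form: the sum is then of size at most (m − 1)·p^{−1/2}, which is the shape of estimate used in the lemma.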

Using Lemma 9 with the largest permissible value of k together with
Lemma 10, one sees easily that when a_γ = p^l and N(p) is sufficiently large,
|G_α(γ)| ≤ N(p^{-l/m}). On the other hand, if p is one of the finite number
of prime ideals whose norms are not "sufficiently large" then Lemma 9
shows that G_α(γ) = N(p^{-l/m}) O(1). These facts, together with the
multiplicative property of G_α(γ) proved in Lemma 7, lead at once to Lemma 5.

An immediate consequence of Lemma 5 is that the singular series 


em GQ) 


dmoda-! 


is majorized by N(a/™) (the latter sum running 
a 


dmoda-? 
over all integral ideals a) and is therefore convergent (since g > 2m). 


Hence σ(T) = σ + o(1), and Theorem 3 will follow when it is shown that
σ > 0.
Let H(a) = G(8). Lemma 7 shows that H(ab) = H(a)H(b) 


6moda-? 
(1,63 )=a-1 


whenever (a,b) = 1, and thus, because of the absolute convergence of all
series involved, the Euler factorization of σ = Σ_a H(a) is given by

(20) σ = Π_p σ_p (p runs over all prime ideals),

where

(21) σ_p = Σ_{l=0}^{∞} H(p^l).


Since the product (20) is convergent, it is now sufficient to show that, for
every prime ideal p, σ_p > 0.




Lemma 9 shows that H(p') = N(p"*)H(p'™) whenever | = 2P(ma;) 
+2 (j=1,:--+,q) andl2m. Hence writing 


Q = max [2P(ma,) +1,- +, 2P(maq) + 1, m—1], 


it is clear that 


Q-—m 
(22) op— + LEN + 


1=Q-m+1 


where the first term is to be deleted when Q = m — 1. 
The partial sum 


k 
> = N(p*) S | 71 
1=0 y ymodad-! modp* p* 

1|yap* 


is easily computed with the aid of the formula 


N(p*) if a, 
0 otherwise, 


(1a). 


5moda-! 
1| dap* 


The result is 
where M(k) is the number of solutions of the congruence 
(23) + + Aqug™ = 0 (mod p*). 
Consequently, (22) yields 
(24) oy = N(p"™2)) 
— N (pa M (Q — m)]. 


Let M_0(k) be the number of primitive solutions of (23), i.e., solutions
satisfying (μ_1, ..., μ_q, p) = 1. Any non-primitive solution of (23) may be
written in the form


(25) = = Agr (mod p*), 
and (25) is a solution of (23) if and only if (for mS;k) 


Since M(k— m) counts the number of sets of \’s modulo p*", whereas in 
(25) they must be counted modulo p**, it follows that 


M(k) — Mo(k) = N (pt )M(k —m) (k= m); 




and therefore, from (24), 
op = (1— 4) 
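
The relation between M(k), the primitive count M_0(k), and M(k − m) can be verified by brute force in a small case. The sketch below assumes K is the rational field and p a rational prime; under that assumption the counting argument above gives M(k) − M_0(k) = p^{(m−1)q} M(k − m) for k ≥ m, the factor p^{(m−1)q} being the number of lifts from residues modulo p^{k−m} to residues modulo p^{k−1}. This exponent is derived here for the rational case and is not read off the printed formula; the function name is illustrative.

```python
from itertools import product


def count_solutions(alphas, m, p, k, primitive_only=False):
    """Count (mu_1, ..., mu_q) mod p**k with sum(alpha_j * mu_j**m) == 0 (mod p**k).

    With primitive_only=True, tuples whose entries are all divisible by p
    are skipped, so only primitive solutions are counted.
    """
    modulus = p ** k
    count = 0
    for mus in product(range(modulus), repeat=len(alphas)):
        if primitive_only and all(mu % p == 0 for mu in mus):
            continue
        value = sum(a * pow(mu, m, modulus) for a, mu in zip(alphas, mus))
        if value % modulus == 0:
            count += 1
    return count
```

For example, with alphas = (1, 1, −1), m = 2, p = 3, k = 3 the difference between the full and primitive counts equals 3³ times the number of solutions modulo 3.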


To prove σ_p > 0, it therefore remains only to show that M_0(Q) > 0. The
concluding argument will actually yield M_0(k) > 0 for every k. If P(α_j) ≥ k,
then (23) has the solution μ_j = 1, μ_i = 0 (i ≠ j). Hence it will be assumed
in what follows that k > P(α_j) (j = 1, ..., q).

An equivalence relation for integers in K is defined by the statement, 
a~B whenever P(«)=P(8) (modm). Since there are m equivalence 
classes, it is clear that if qg’ is the smallest integer not less than g/m then 
at least g’ of the integers a,,---,@ belong to the same class. Suppose 
without loss of generality that P(aj) + mh; 
(j=1,---,9’), OSho<m. Then integers 8; can be found satisfying 
aj = Bs (mod p*), (Bj, p) =1 (j= 7). 

Consider now a second equivalence relation for integers of K defined 
by the statement, « ~ 8 whenever (a, p) = (8, p) = 1 and (mod p*) 
for some integer A. If N, is the number of solutions of the congruence 
A" = 1 (mod p*), then (N(p*) — N(p*"))/N; is the number of incongruent 
m-th power residues modulo p* which are relatively prime to p, and therefore 
N;, is the number of equivalence classes under ~. 

To estimate N;, observe first that NV, = m, since the multiplicative group 
modulo p is cyclic. It is also clear that = N(p)Nar (k =2,3,-- -), 
whence 


(26) Ny < N (pe) m 


If k=2P(m) +1 and A” =1 (mod p*), then » =A (mod p*-P(™)) implies 
+ y (mod p***), whence w™ = (A + = = 1 (mod p*) 
independently of v, whereas, writing A" — 1 = (mod p**") and m B 
(mod p***) with (8,p)—1, the congruence 1=p" =A" + mrP(m)m-1, 
(mod p**) is seen to be satisfied only when a + BA""y=0 (modp). Thus 
the number of solutions of »” =1 (mod p*), »=A (mod pk-P(™)) is N(p?(™) 
and the number of solutions of ==1 (mod (mod js 
also N(p(™)); in other words, when k=>2P(m)+1. Taken 
together with (26), this information yields the inequality 


N (p??(™)) m = N(m?)m m2ntt, 


If q'' is the smallest integer not less than q'/m^{2n+1}, the last inequality
shows that at least q'' of the integers β_1, ..., β_{q'} belong to a single
equivalence class under ~. Suppose without loss of generality that
β_1 ~ ... ~ β_{q''}


and h,=---<hq. Determine integers y; all relatively prime to p such 
that y;"8; = By (mod (j=—1,- - -,q’—1), and set 
(27) pees 
Lo +1,---,q). 


Then (23) becomes 
+1) = 0 (mod p*), 


which certainly holds if - are rational integers satisfying 
(28) Vq"-1™ (mod pr’) g 

re 
where p is the rational prime divisible by p and k’ = k— P(aq). i 


Since q ≥ 4m^{2n+3} + 1, it follows that q' ≥ 4m^{2n+2} + 1 and q'' ≥ 4m + 1.
But it is known that every rational integer is congruent modulo an arbitrary
rational prime power to a sum of 4m m-th powers. Hence (28) has indeed
a solution. This may be substituted in (27) to obtain a solution of (23)
which, since μ_{q''} = 1, is certainly primitive.
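
The fact quoted above, that every residue modulo a rational prime power is a sum of 4m m-th powers, can be explored by brute force for small moduli. The sketch below (with an illustrative function name) computes the least number s such that every residue class is a sum of s m-th powers, zero being allowed as a summand, so that "exactly s" and "at most s" coincide.

```python
def min_powers_needed(m: int, modulus: int) -> int:
    """Least s such that every residue mod `modulus` is a sum of s m-th powers."""
    powers = {pow(x, m, modulus) for x in range(modulus)}  # contains 0 and 1
    reachable = {0}  # residues representable with the current number of summands
    count = 0
    while len(reachable) < modulus:
        # Adding one more summand; since 0 is a power, the set never shrinks.
        reachable = {(r + s) % modulus for r in reachable for s in powers}
        count += 1
    return count
```

The extreme case among small examples is m = 4 modulo 16, where the only fourth-power residues are 0 and 1, so 15 summands are needed to reach the residue 15; this is still within the bound 4m = 16.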

fi 

THE JOHNS HOPKINS UNIVERSITY.


REFERENCES.

[I] R. Brauer, "A note on systems of homogeneous algebraic equations," Bulletin of
the American Mathematical Society, vol. 51 (1945), p. 749.

[II] C. L. Siegel, "Generalization of Waring's problem to algebraic number fields,"
American Journal of Mathematics, vol. 66 (1944), p. 122.

[III] C. L. Siegel, "Sums of m-th powers of algebraic integers," Annals of Mathematics,
vol. 46 (1945), p. 313.



SOME DIOPHANTINE ASPECTS OF MODULAR FUNCTIONS.* 
I. ESSENTIAL SINGULARITIES. 


By Harvey Cohn.


1. Introduction. The modular group is a natural invariant involved
in many problems in diophantine approximation. Some very basic properties
of real numbers x are preserved under the transformations of the modular
group: x' = (ax + b)/(cx + d), where a, b, c, d are integers satisfying the
relation ad − bc = 1. To cite a few of these properties, we might mention
the rationality of x and the boundedness of the partial quotients in the simple
continued fraction for x.

Although any real function of the real variable x, invariant under the
transformations of the modular group, is either constant or totally
discontinuous, there are functions of a complex variable z, analytic in the upper
half plane, and possessing the desired invariance. Aside from the constant
function, these so-called modular functions all have essential singularities on
the real axis. Our problem is then that of investigating the influence of the
diophantine character of a real number x on the essential singularity of
certain modular functions at x.

Hardy and Littlewood [3], [4]¹ carried out such an investigation for
the purpose of setting bounds on certain exponential sums (given by the
Jacobi theta functions on their boundary circle) by studying these functions
near the singularity in question. Conversely, Wintner [7] analyzed the
behavior of the Dedekind η-function by considering the coefficients in its
power series expansion. Although we present a discussion of the singularities 
of certain modular functions as related to the singularities of the terms of 
the infinite series representing them, our objectives will be primarily in the 
realm of function theory, since we shall, for instance, demonstrate function- 
theoretical properties in the large on the basis of the behavior of modular 
functions near singularities. 


* Received April 21, 1948; revised November 20, 1948; presented to the American 
Mathematical Society, June 19, 1948. Based on the author’s doctoral dissertation 
(Harvard, 1948). 

¹ Numbers in brackets refer to the bibliography at the end of the article.




404 HARVEY COHN. 


2. Outline. We choose as the particular modular functions, the Poin- 
caré theta functions, f(z), defined by the property 


(1) f((az + b)/(cz + d)) = (cz + d)^{2m} f(z),

a property which generalizes the behavior of Jacobi theta functions. We 
shall use as our starting point some well known results of Poincaré [6], 
which we summarize in the next section. 

Poincaré theta functions interest us because they have the advantage of 
being easily obtained as infinite series whose individual terms often suggest 
the behavior at essential singularities. For instance, we can take the Eisen- 
stein series, 


(2) F_2m(z) = Σ_{(c,d) ≠ (0,0)} (cz + d)^{-2m},

where (c,d) runs over all integer pairs except (0,0), and where m is an
integer ≥ 2. This is a Poincaré theta function in which each individual term
becomes singular only at one rational point, z = p/q. At such a point,
F_2m(z) behaves like (z − p/q)^{-2m} if z approaches p/q in a Stolz
neighborhood² at p/q. Rational points, furthermore, are characterized by this
behavior, as we shall see in subsequent paragraphs.

If we were interested in the behavior of modular functions at z = θ,
where θ is a real number whose continued fraction expansion possesses
bounded partial quotients, we would find it natural to consider a sum of the
type Σ (Az² + Bz + C)^{-m}, where the summation goes over all forms
equivalent, in the classical sense, to a given indefinite form. This, too, would
generate a Poincaré theta function. The individual terms of its series have
singularities only at certain real quadratic irrationals, which assuredly have
continued fractions with bounded (in fact with ultimately periodic) partial
quotients. We shall single out for special attention the conglomerate
Poincaré theta function,
(3) (c+ (a+d)z+b) -G 

ad-be=1 
where the summation extends over all substitutions in the modular group.
This function behaves at real irrational numbers θ whose continued fractions


have partial quotients like O(z—9@)-*™ as z approache — in a Stolz neigh- 
borhood. In fact when m = 6, and only then, the beh.» or is ~ (2 —6)-™.? 


² A Stolz neighborhood of θ, in the present context, is a region of the z-plane with
θ as a boundary point and contained in a sector of the form y/|x − θ| > λ, where
λ > 0.

³ We write f(z) ~ g(z) or f(z) ≍ g(z), when f(z)/g(z) approaches a constant or
remains bounded away from 0 and infinity, respectively.




As we shall note later on, this behavior is also, in a sense, characteristic of 
this type of irrational. 

It should be pointed out, however, that the term by term analysis by 
which we guess at the behavior near essential singularities can not always 
be readily supported. True, we can justify the term by term analysis with 
regard to the rational singularities of F_2m(z), but we shall not even attempt
by a term by term analysis to explain how G_2m(z) can vanish identically, as
it does when m = 2, 3, 4, 5, or 7. We shall, therefore, have to rely on
function-theoretical as well as analytic methods, expressing Poincaré theta 
functions in terms of the better known elliptic modular function J(z). 

We shall see that the results applying to F_2m(z) and G_2m(z) are just
special cases of results for general theta functions, which we shall establish 
presently. We shall also see that the asymptotic behavior, to a certain 
extent, can describe the function and the singularity. 


3. Preliminary results on theta functions. We shall now collect some 
results on the convergence of Poincaré theta functions. These results are 
either proved in the literature or are immediate consequences of such results. 

The modular group defined in section 1 is generated by the substitutions
z → z + 1 and z → −1/z. It has a fundamental domain D consisting of the
region D* of the upper half plane defined by the inequalities |z| > 1 and
|ℜz| < ½, together with the boundary points of D* identified according
to the two transformations z → z + 1 and z → −1/z. Every point z of the
upper half plane is equivalent, modulo the modular group, to a single point
in the fundamental domain D. (The point at infinity, being a boundary
point, is excluded from the upper half plane.)
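
The statement that every point of the upper half plane is equivalent to a point of the fundamental domain can be made concrete by the usual reduction procedure, which uses only the two generating substitutions. The following is a minimal sketch in floating-point arithmetic; the function name and the step limit are illustrative choices, not part of the paper.

```python
def reduce_to_fundamental_domain(z: complex, max_steps: int = 10_000) -> complex:
    """Move z (with Im z > 0) into |Re z| <= 1/2, |z| >= 1 using z -> z + n and z -> -1/z."""
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half plane")
    for _ in range(max_steps):
        # Integer translation brings the real part into [-1/2, 1/2].
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1.0:
            return z
        # Inside the unit circle the inversion strictly increases Im z.
        z = -1 / z
    raise RuntimeError("reduction did not terminate within max_steps")
```

Each inversion replaces Im z by Im z / |z|², which is larger whenever |z| < 1, so the loop cannot cycle; boundary identifications are not normalized in this sketch.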

We define the elliptic modular function, with Dedekind, as the function
J(z) which maps the fundamental domain onto the whole z-plane so that
J(∞) = ∞, J(i) = 1, and J(½(1 + i√3)) = 0. The mapping function,
J(z), by virtue of the shape of the fundamental domain, will take on the
value unity twice at z = i, and will have a triple zero at z = ½(1 + i√3).
As ℑz becomes infinite, J(z) likewise will behave like

(exp −2πiz)(k_0 + k_1 exp 2πiz + k_2 exp 4πiz + ···).

The exact value of k_0, namely 1/1728, can not be immediately inferred from
the shape of the fundamental domain, but we shall concern ourselves only
with the more evident fact that k_0 ≠ 0. From its definition, J(z) is invariant
under the modular group. It can be further shown that any single-valued
function of z, invariant under the modular group, is a single-valued function
of J(z).

A Poincaré theta function is defined as a single-valued function f(z) 
regular in D with the exception of a finite number of poles, and satisfying 
the transformation formula (1). We call m the order of the Poincaré theta 
function. Thus J'(z) is of order 1, and f(z)J'(z)^{-m} is a single-valued
function of J(z). In particular, if it is known that f(z) has no singularities
in the upper half plane, it easily follows that the only poles of f(z)J'(z)^{-m}
are those induced by the zeros of J’(z), and these can be easily accounted 
for. Thus we can write 


(4) f(z) = J'(z)^m J(z)^{-m_0} (J(z) − 1)^{-m_1} P(J(z)),


where P(w) is an entire function of w. Suitable values of m_0 and m_1 are
given in the following table [6; 599]:


m ≡ 0 (mod 6)    m_0 = 2m/3        m_1 = m/2        m_2 = m/6
    1                  2(m − 1)/3        (m − 1)/2        (m − 7)/6
    2                  (2m − 1)/3        m/2              (m − 2)/6
    3                  2m/3              (m − 1)/2        (m − 3)/6
    4                  2(m − 1)/3        m/2              (m − 4)/6
    5                  (2m − 1)/3        (m − 1)/2        (m − 5)/6
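
The table can be encoded directly as a lookup on m modulo 6. The sketch below does this and leaves room to compare the entries with closed forms; the observation that m_0 = ⌊2m/3⌋, m_1 = ⌊m/2⌋ and m_2 = m_0 + m_1 − m reproduces every row is added here as a consistency check and is not stated in the paper. The function name is illustrative.

```python
def table_exponents(m: int) -> tuple[int, int, int]:
    """Return (m_0, m_1, m_2) for a theta function of order m, as listed in the table."""
    # Numerators of the six rows; the denominators are 3, 2 and 6 respectively.
    rows = {
        0: (2 * m, m, m),
        1: (2 * (m - 1), m - 1, m - 7),
        2: (2 * m - 1, m, m - 2),
        3: (2 * m, m - 1, m - 3),
        4: (2 * (m - 1), m, m - 4),
        5: (2 * m - 1, m - 1, m - 5),
    }
    n0, n1, n2 = rows[m % 6]
    # Every entry of the table is an integer for the residue class it belongs to.
    assert n0 % 3 == 0 and n1 % 2 == 0 and n2 % 6 == 0
    return n0 // 3, n1 // 2, n2 // 6
```

For instance, `table_exponents(6)` gives (4, 3, 1), and m_2 = 0 exactly for m = 2, 3, 4, 5 and 7.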


The Poincaré theta functions, having the period 1, can always be
expanded into a Laurent series in exp 2πiz valid for ℑz positive and large.
In some cases the series is finite in one direction so that

f(z) = (exp −2πirz)(k'_0 + k'_1 exp 2πiz + k'_2 exp 4πiz + ···),

where k'_0 ≠ 0 and r is an integer. (This occurs, for instance, when f(z) is
bounded as ℑz → ∞.) Then we say f(z) is of degree r. Thus if f(z) has
no poles the function P(w) introduced in equation (4) is a polynomial of
degree r + m_2, where m_2 is given in the above table. For other
function-theoretical aspects of theta functions, too numerous to recall here, we refer
to Poincaré's papers. We are principally interested in the tables just
presented.


4. Preliminary results on theta series. We are now ready to consider
some examples of theta functions expressed as infinite series. We consider 


the series, 


* The integer m_2 will be explained presently.




(2) F_2m(z) = Σ_{(c,d) ≠ (0,0)} (cz + d)^{-2m}
and 
(5) @am(2) (cz (az + b)/(c2 + 


where m ≥ 2 and the first and second sums are over all integer sets that
satisfy the conditions stated under the summation. In either case it is
apparent that we are dealing with a theta function if we can assume absolute
convergence, since a rearrangement of terms shows that (1) is satisfied by
F_2m(z) and Φ_2m(z).

In a region 0, bounded away from the real axis and having only a 
finite projection on the real axis, F2m(z) converges absolutely and uniformly, 
by well known estimates. 

If, in the series for Φ_2m(z) as given in equation (5), we sum first over
all a, b satisfying a ≡ α (mod c) and b ≡ β (mod c), α, β, c and d being
fixed, then we obtain


(5a) @am(z) S* (cz cot (az + b)/(cz + d)), 


ad-bc=1 


where the asterisk means that a and b run independently over complete sets 
of residues modulo c. We now assume once and for all that So = 0; thus 
we avoid poles. The series (5) with appropriate grouping of terms, or series 
(5a), converges absolutely and uniformly in the region #@ mentioned above. 

If we differentiate the series (5) term by term we obtain a series in 
which the convergence is now absolute and uniform, without grouping of 
terms, in the region #&. Thus we define 


(6) Gam (2,0) = — — 1)! 
= > (czw + az + dw + 6)-™. 


Since G_2m(z,z) is precisely G_2m(z), we see how to obtain the latter theta
series from Φ_2m(z).


5. Non-vanishing of theta functions. We are now faced with the by 
no means trivial problem of deciding if our theta series vanish identically. 
Without such a decision, the problem of asymptotic behavior may be 
meaningless. 

The case of F_2m(z) is easily disposed of. If z approaches infinity along
the imaginary axis, by the uniformity of convergence of its series, F_2m(z)
must approach the part of its sum consisting of constant terms, which makes
for a limit of 2ζ(2m). Hence F_2m(z) is not identically zero for any integer
m ≥ 2.
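
The limit 2ζ(2m) can be observed numerically. The sketch below sums the series (2) over the square |c|, |d| ≤ bound (a truncation chosen here for convenience, not taken from the paper) and compares the value at a point high on the imaginary axis with 2ζ(4) = π⁴/45 in the case m = 2.

```python
import math


def eisenstein_partial_sum(z: complex, m: int, bound: int) -> complex:
    """Sum of (c*z + d)**(-2m) over integer pairs with |c|, |d| <= bound, (c, d) != (0, 0)."""
    total = 0j
    for c in range(-bound, bound + 1):
        for d in range(-bound, bound + 1):
            if c == 0 and d == 0:
                continue
            total += (c * z + d) ** (-2 * m)
    return total


# Closed form zeta(4) = pi**4 / 90, so the predicted limit for m = 2 is pi**4 / 45.
TWO_ZETA_4 = math.pi ** 4 / 45
```

With z = 10i and bound = 100 the partial sum agrees with π⁴/45 to several decimal places: the terms with c = 0 supply the constant, while the contribution of each row with c ≠ 0 nearly cancels.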

Next consider Φ_2m(z) as written in the form (5a). By the uniformity
of convergence again, we can let z become infinite along a line parallel to
the imaginary axis and find that Φ_2m(z) approaches −2πi (since only the
terms with a = ±1, b = 0, c = 0, d = ±1 then remain in (5a)). But
now, in expansion (4), m_2 = 0 or P is a constant when Φ_2m(z) is expressed
in terms of the elliptic modular function for m = 2, 3, 4, 5 or 7. The same
is true for F_2m(z). Hence comparing the values at infinity, we find
Φ_2m(z) = −πi F_2m(z)/ζ(2m). Thus, despite appearances, Φ_2m(z) does not
depend on w, and its derivative with respect to w must be identically zero;
in particular so are G_2m(z,w) and G_2m(z), for these small values of m.

To see the dependence on w of Φ_2m(z) for all other values of m, we
consider the expansion of this function into powers of exp 2πiz and exp 2πiw,
which leads to the following formula [6; 615]:


— (2) (4ri)“* — Fam (2) (4E(2m))* Ane exp Rai (ne + ke) 


(7) Ane = + (—)™2e (nk)¥/c) 


=1 


D> exp 2ai(ka + na’) /c. 


aa’=1(c) 


Here δ_nk is the Kronecker symbol, and J_{2m−1}(z) is the Bessel function of
indicated order. As noted by Poincaré, when m = 2, 3, 4, 5 or 7, each A_nk
is zero, which leads to some rather remarkable identities. We are more
directly concerned with showing that A_11 ≠ 0 if m = 6 or m ≥ 8, whence
none of the derivatives of Φ_2m(z) with respect to w are identically zero. To
see this, we need only estimate the Bessel functions occurring in (7) to show
that, when n = k = 1, the second term in the expression for A_nk is much
less than δ_11 (= 1). It is sufficient, for our purposes, to know that the
exponential sum in (7) is less than c. The tedious details of the estimations
can be omitted, since they simply consist of taking enough terms of the power
series for J_{2m−1}(z) about the origin, so that the remaining terms in the
expansion will constitute an alternating series. We find that two terms suffice
in this case. We easily make the estimate A_11 = 1 + (−)^m 2π J_{2m−1}(4π)
+ error of less than .01. Furthermore, J_11(4π) = .291···, J_13(4π)
= .159··· (rather close to (2π)^{-1}), and J_{2m−1}(4π) < .05 when m ≥ 8.

This finally establishes the dependence on w of Φ_2m(z) for m = 6 and m ≥ 8.
The non-vanishing of these A_11 even shows that in G_2m(z,w) the leading
term exp 2πi(z + w) has a non-zero coefficient.
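
The Bessel values quoted above can be reproduced without special libraries by applying the trapezoid rule to Bessel's integral J_n(x) = (1/π) ∫₀^π cos(nt − x sin t) dt, which converges very quickly because the integrand is smooth and periodic. The sketch below is one such computation; `leading_coefficient` evaluates only the main part 1 + (−)^m 2π J_{2m−1}(4π) of the estimate for A_11 and ignores the error term. Both function names are illustrative.

```python
import math


def bessel_j(n: int, x: float, steps: int = 4000) -> float:
    """J_n(x) from Bessel's integral, using the trapezoid rule on [0, pi]."""
    h = math.pi / steps
    # Endpoint values: t = 0 gives cos(0), t = pi gives cos(n*pi); each has weight 1/2.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for j in range(1, steps):
        t = j * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def leading_coefficient(m: int) -> float:
    """Main part of the estimate for A_11: 1 + (-1)**m * 2*pi * J_{2m-1}(4*pi)."""
    return 1 + (-1) ** m * 2 * math.pi * bessel_j(2 * m - 1, 4 * math.pi)
```

For m = 7 the main part is close to zero, in keeping with the vanishing of A_11 in that case, while for m = 6 it is well away from zero.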




Collecting the results of the above calculation, we find that F_2m(z) is
of order m and degree 0 for m ≥ 2, while G_2m(z) is of order 2m and of
degree −2, for m = 6, or m ≥ 8. The expansions in (4) become


(8a) Fom(z) = J’ (z)™J (z)-™(J (z) —1)-™ (J (2) ™ + 
hing”) 26(2m) (— 

and, by symmetry, 

(8b) Gam(2, 0) (2) "J (2) —1)-™Png a (2)) 
(w) "I (w) (w) —1)-™P mys (J (a) ) 


where P_{m_2−1} is a polynomial of indicated degree. Setting w = z, we find for
m = 6 and m ≥ 8


(8c) Gam (2) [J’(2) (2) (T(z) — 1) a (J (2))F 


The expression for G_12(z) in the equation (8c) is of some interest since
it is proportional to the square of J'(z)^6 J(z)^{-4} (J(z) − 1)^{-3}, which in turn
is proportional to the twenty-fourth power of η(z), the Dedekind η-function.
To see the last statement, we combine the result [2], J(z) = F4(z)n(z)-*4/1728 
with the expansion for /’,(z) in equation (8a). 

We see from the table in the last section that only Poincaré theta 
functions of order 0, ±6, ±12, ···, can be devoid of zeros and poles.
It is also clear that G,.(z)* is such a function when & is an integer. Thus 
the theta function f(z) has no zeros or poles if and only if F(z) = Gy2(z)* 
exp g(J(z)), where g(w) is an entire function of w. Hence among functions 
of finite degree, the only theta functions devoid of zeros and poles are G1.(z)* 
times a constant. 


6. Asymptotic behavior and term by term estimation. The function
F_2m(z) = Σ (cz + d)^{-2m}, m ≥ 2, has the advantage of permitting a term by
term analysis of its behavior at rational points. Specifically, let z approach
p/q in a Stolz neighborhood, or let z − p/q = ρ_1 + iρ_2 where ρ_1 and ρ_2 are
real, ρ_2 > 0, |ρ_2/ρ_1| ≥ λ (a parameter of slope of the Stolz neighborhood),
and ρ_1² + ρ_2² → 0. We break up F_2m(z) into two parts, one part F^(1)
containing just those terms which become infinite at z = p/q, and another part
F^(2) consisting of the remaining terms.

Then F^(1) is the summation restricted to those terms where c/d = −q/p,
or F^(1) = (z − p/q)^{-2m} q^{-2m} 2ζ(2m).

It remains to estimate the terms where c/d ≠ −q/p. Setting z = p/q
+ ρ_1 + iρ_2, we find


F^(2) = Σ_{cp+dq≠0} (cp/q + d + cρ_1 + icρ_2)^{-2m}.


It is majorized by its series of absolute values,


q^{2m} Σ [(cp + dq + cρ_1 q)² + (cqρ_2)²]^{-m}.

Now the terms may be grouped as follows: Since (p,q) = 1, cp + dq takes
on each non-zero integer n for a family of values (c,d) of the form
c = c_n + μq, d = d_n − μp, where (c_n, d_n) is a special solution to cp + dq = n
determined by the condition 0 ≤ c_n < q, and μ takes on all integral values.
We may now take our majorant as follows:


(9a) |F^(2)| ≤ q^{2m} Σ_{μ=−∞}^{∞} Σ_{n=−∞}^{∞} {[n + ρ_1 q²(μ + c_n/q)]²
+ [ρ_2 q²(μ + c_n/q)]²}^{-m},

where n ≠ 0.


We consider the summation on n for p» fixed. The quantity A,—n 
+ piq?(u + ¢n/q) has the difference — An =k + Cn), which 
differs from k& by less than 4, if pig? < 4, as we shall now suppose. Hence 
the sum 


p=- 0 
majorizes the sum in (9a) except for those terms of (9a) where, for 
given p, n takes on such a value as to make |A,n| <4. But clearly, if 


so, | en/q)| >| m|—4 and + ¢n/q) > | po/pr|(| | —4). 
Adding this contribution to S, we get 


+ (| m |— 3)?" (p2/p1)-™. 


Now the first part of the right hand side of (9b), or S,, can be majorized 


by an infinite integral if we observe that the summation on yp is of the E 


type where ¢(m) decreases monotonically as || increases. Thus, 


adding a term to take care of ¢(0), we obtain the estimate 


nN=- 


and, changing the variable of integration, we obtain 


n=- © 3 ey 




Since we are in a Stolz neighborhood, as z— p/q (=p: + tp.) 0 the 
ratio | p2/pi | =A, hence = = O(z— p/q)+, whence finally 


(9) Pom(z) = (2m) (2 — p/q)°™" + O(z — p/q)*. 


(The error term will shortly be improved to o(1).)

As in the case of the identically vanishing theta functions, such as 
G,(z), the singularities of individual terms do not always reflect the behavior 
of the series. In general, we shall have to resort to expansion in terms of 
modular functions to describe behavior at singularities. 


7. Use of modular functions: rational singularities. We now consider
a theta function f(z) with or without poles and of order m and degree r.
Then as z → p/q, a rational number (in lowest terms), in a Stolz neighborhood,


f(z) ~ C_0 (z − p/q)^{-2m} exp 2πirq^{-2}(z − p/q)^{-1}.


Here the constant Cy is k’oq-?" exp — 2mirp’q, where pp’ =1 (mod q), and 
k’, = lim f(z) exp 27irz as 
The proof of this result is quite simple. We write 


z' = (p'z − q')/(−qz + p) where pp' − qq' = 1.


Then ℑz' → ∞ as z → p/q in a Stolz neighborhood. Now since f(z')
= f(z)(−qz + p)^{2m}, then f(z) = f(z')(−qz + p)^{-2m}. The theorem
follows from the asymptotic expression for f(z') combined with
z' = −p'/q − q^{-2}(z − p/q)^{-1}.
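
The change of variable used in this proof is easy to experiment with. The sketch below builds z' = (p'z − q')/(−qz + p) from a fraction p/q in lowest terms, choosing p' as the inverse of p modulo q (one admissible choice; the paper only requires pp' − qq' = 1), and makes it possible to confirm that ℑz' = ℑz/|−qz + p|², so that ℑz' grows without bound as z tends to p/q along a ray inside a Stolz neighborhood. The function name is illustrative, and the three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
def to_cusp_coordinate(z: complex, p: int, q: int) -> complex:
    """Apply z -> (p'z - q')/(-qz + p) with p*p' - q*q' = 1, for gcd(p, q) = 1 and q >= 1."""
    p_prime = pow(p, -1, q)            # p * p' == 1 (mod q)
    q_prime = (p * p_prime - 1) // q   # exact division, so p*p' - q*q' = 1
    assert p * p_prime - q * q_prime == 1
    return (p_prime * z - q_prime) / (-q * z + p)
```

Along the ray z = p/q + ρ(1 + i) one finds ℑz' = 1/(2q²ρ), which increases as ρ decreases.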

We observe that the error is O(exp[−2πiq^{-2}(z − p/q)^{-1}]), which
approaches zero faster than any power of (z − p/q). Thus, for instance,
F_2m(z), which is of order m and degree 0, can be estimated with a principal
term as in equation (9) but with error term o(1).

We might add in conclusion that to a certain extent the behavior 
characterizes the theta function. For, if a theta function satisfies the 
relation f(z) ~ (z— p/q)’, for s real, as z—> p/q in a Stolz neighborhood, 
then f(z) is of degree 0 and s = —2m, where m is the order of f(z). This 
would follow immediately if we knew that f(z) was of finite degree, since 
then only the degree zero would permit of a behavior which is a power of 
(z—p/q). To see that f(z) is of finite degree, note that a wide enough 
Stolz neighborhood about p/q becomes, in the z’-plane, a region wide enough 
to include the infinite portion of any vertical strip of width 1. Thus, however
$z'$ may approach $\infty$, $f(z')$ is smaller than some power of $(z - p/q)$,




412 HARVEY COHN. 


which, in turn, is smaller than $\exp(-2\pi i z')$. This proves that $f(z)$ must
be of finite degree. 


8. Dissection of the neighborhood of an irrational number. The 
change of variable, which served as a deus ex machina in the previous section 
on rational singularities, can be applied to the irrational singularities with 
slight modifications.⁵

Our starting point is the simple continued fraction for an irrational
number $\theta$. We shall use a few elementary properties of irrational numbers,
which can be found, e.g., in Perron [5]. The expansion of $\theta$ into a simple
continued fraction is written as $(a_0, a_1, a_2, \ldots)$. We denote
by $p_n/q_n$ the convergents. A connection between the
denominators of the convergents and the degree of accuracy of the approximation of $\theta$ by $p_n/q_n$ is given by


(10a) < (—) "qu? Gn) < (Ana + 1); 
and the growth of the denominators of the convergents is estimated by 
(10b) $a_{n+1} < q_{n+1}/q_n < a_{n+1} + 1$.
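These elementary properties of the convergents are easy to test numerically. The sketch below is an illustration only: the choice $\theta = \sqrt{2}$ with partial quotients $(1, 2, 2, \ldots)$ and the helper name `convergents` are assumptions, not taken from the paper. It checks the growth estimate $a_{n+1} < q_{n+1}/q_n < a_{n+1} + 1$ for $n \ge 1$ and the standard bound $|\theta - p_n/q_n| < q_{n+1}^{-1} q_n^{-1}$ used later in the text.

```python
import math

def convergents(partial_quotients):
    """Return the list of (p_n, q_n) for the simple continued fraction (a_0, a_1, ...)."""
    p_prev, q_prev = 1, 0                # p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1       # p_0, q_0
    result = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

theta = math.sqrt(2.0)                   # illustrative choice: sqrt(2) = (1, 2, 2, 2, ...)
a = [1] + [2] * 10
pq = convergents(a)

checks = []
for n in range(1, len(pq) - 1):
    p_n, q_n = pq[n]
    q_next = pq[n + 1][1]
    ratio_ok = a[n + 1] < q_next / q_n < a[n + 1] + 1
    approx_ok = abs(theta - p_n / q_n) < 1.0 / (q_n * q_next)
    checks.append(ratio_ok and approx_ok)
```

The check starts at $n = 1$ because for $n = 0$ the ratio $q_1/q_0$ equals $a_1$ exactly, so the strict lower bound fails there.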


We are now ready for our major definition: We say that the complex
number $z$ belongs to the convergent $p_n/q_n$ of the irrational number $\theta$ when


$$q_{n+1}^{-2} < \Im z \le q_n^{-2}.$$


THEOREM. If we let the Stolz neighborhood at $\theta$ be of slope $\ge A$, i.e.,
if we let it lie within the angular region $\Im z \ge A\,|\operatorname{Re}(z - \theta)|$, then under
the transformation $z' = (p'z - q')/(-q_n z + p_n)$, $(p_n p' - q_n q' = 1)$, we find
that a number $z$ belonging to $p_n/q_n$ with respect to $\theta$ goes into a number $z'$
whose imaginary part can be estimated by


(11) $\tfrac12 (a_{n+1} + 1)(2 + A)/A > \Im z' > \tfrac12 (a_{n+1} + 1)^{-2} \exp(-2/A)$.


Proof. Let us first observe that the set of points belonging to $p_n/q_n$
and lying in the Stolz neighborhood has a non-euclidean diameter of less
than $2/A + 2 \log q_{n+1}/q_n$. To see this, connect any two points of the region
by a path consisting of two segments, one horizontal (of non-euclidean
length $< \int dx/y = 2/A$) and one vertical (of non-euclidean length $\le \int dy/y = 2 \log q_{n+1}/q_n$).

Next perform the transformation $z' = (p'z - q')/(-q_n z + p_n)$. At


5 The methods presented in Sections 8 and 9 are a generalization of those of Hardy 
and Littlewood [3]. 




least one point belonging to $p_n/q_n$ must go into a point with $\Im z' > \tfrac12 a_{n+1}$.
For, as previously noted, $|\theta - p_n/q_n| < q_{n+1}^{-1} q_n^{-1}$. Thus the Ford circle [1],
drawn tangent to the real axis at $p_n/q_n$, and having radius $a_{n+1}^{-1} q_n^{-2}$, must
contain in its interior some points belonging to $p_n/q_n$, such as $\theta + i q_{n+1}^{-1} q_n^{-1}$,
whose imaginary part lies between $q_n^{-2}$ and $q_{n+1}^{-2}$. But the interior of this
Ford circle (which we shall denote, for convenience, by $C_{p_n/q_n}(a_{n+1})$) clearly
goes into the portion of the $z'$-plane with $\Im z' > \tfrac12 a_{n+1}$.

We can now prove the second inequality in (11). The presence of a
point $z'_0$ above the line $\Im z' = \tfrac12 a_{n+1}$ excludes the presence of a point below
the line $\Im z' = h$, if we choose $h$ small enough (by virtue of the boundedness
of the non-euclidean diameter of the set of points belonging to $p_n/q_n$ and
hence the boundedness of the $z'$-image). We need only take $h$ such that


$$\int_h^{a_{n+1}/2} dy/y = 2/A + 2 \log q_{n+1}/q_n.$$


This exceeds the non-euclidean diameter
in question. Thus we may take $h$ as $\tfrac12 (q_n/q_{n+1})^2 \exp(-2/A)$, which leads to
the inequality in question when we note that $q_n/q_{n+1} > (a_{n+1} + 1)^{-1}$.

To obtain the first inequality in (11) we first construct a Ford circle
about $p_n/q_n$ so small that it excludes all points of the Stolz neighborhood.
The circle of radius $A(2 + A)^{-1}\,|\theta - p_n/q_n|$ will suffice for that purpose.
For this circle is determined by the condition that its circumscribed square
exactly touches one of the lines $y = A\,|x - \theta|$. This circle clearly contains
the smaller circle


$$C_{p_n/q_n}\big((a_{n+1} + 1)(2 + A)/A\big), \text{ of radius } A(2 + A)^{-1}(a_{n+1} + 1)^{-1} q_n^{-2}.$$


But since no points of the Stolz neighborhood lie inside this circle, it follows
that in the $z'$-plane no images lie on or above the line


$$\Im z' = \tfrac12 (2 + A)(a_{n+1} + 1)/A. \qquad \text{Q. E. D.}$$
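The choice of radius in the last step can be verified directly. The sketch below is illustrative only: the slope $A$, the distance $d = |\theta - p/q|$, the sampling grid and the helper name `excluding_radius` are assumptions. It places $p/q$ at the origin and $\theta$ at $x = d$, checks that a corner of the circumscribed square of the circle of radius $Ad/(2 + A)$ lies on the line $y = A(d - x)$, and checks that sampled boundary points of the Stolz neighborhood stay outside the circle.

```python
import math

def excluding_radius(A, d):
    """Radius A*d/(2 + A) of the circle tangent to the real axis at p/q, with d = |theta - p/q|."""
    return A * d / (2.0 + A)

A, d = 1.5, 0.01                  # illustrative slope and distance, not from the paper
rho = excluding_radius(A, d)

# The corner (rho, 2*rho) of the circumscribed square lies on the line y = A*(d - x).
corner_gap = 2 * rho - A * (d - rho)

# Sampled points on the boundary y = A*|x - d| of the Stolz neighborhood stay outside
# the circle centred at (0, rho).
clearance = min(
    math.hypot(x, A * abs(x - d) - rho) - rho
    for x in (d * k / 1000.0 for k in range(-3000, 5001))
)
```

A positive `clearance` means that none of the sampled boundary points falls inside the circle, in agreement with the claim that the circle excludes the Stolz neighborhood.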


We conclude this section by observing that under the previous conditions,
$z$ belonging to $p_n/q_n$ and lying in a Stolz neighborhood of $\theta$ of slope no less
than $A$, the quantities $z - \theta$, $-q_n z + p_n$, etc., are interrelated. First of all


(12a) + 1)? Qn? << | < (1 +A) qn. 


This follows from the relations > 32 Sqn”, and | Re 
Likewise, 


(12b) A(1 + (Gnas — Qn? < | %— Pn/Gn | < (2 +A) qn. 


For the distance from pn/qn to the nearest of the two lines y= + A(x — 8) 
is pn/gn | (1 + A?) 4, which exceeds A(1 + On 




the other hand, the distance $|z - p_n/q_n|$ is less than the sum $|z - \theta| + |\theta - p_n/q_n|$. This justifies the relation (12b). Dividing (12a) by
(12b), we obtain the final estimate


(12) +A) (dn +1)? 
< + pn | ?/|2—O| < (2 +A)? (Quer + 1)?. 


9. Behavior of theta functions near irrational singularities. We let $\theta$
be an irrational number whose continued fraction has bounded partial
quotients. Such numbers $\theta$ are known to form a non-enumerable set of
measure zero, containing all real quadratic irrationals among others. We 
were interested in the function G,,,(z) since some of the singularities of 
individual terms of its series occur at real quadratic numbers, but we can 
just as well carry out the analysis for the larger class of numbers referred 
to above. 

We consider a theta function of finite or infinite degree, without poles,
and of order $m$. Under the change of variable $z' = (p'z - q')/(-q_n z + p_n)$,
where $p'p_n - q'q_n = 1$, we obtain: $f(z) = f(z')(-q_n z + p_n)^{-2m}$. If $z$
approaches $\theta$ in a Stolz neighborhood of slope less than $A$ and if $\theta$ has
bounded partial quotients, then $(-q_n z + p_n)^2 \asymp (z - \theta)$ where $z$ belongs
to $p_n/q_n$. This follows from inequality (12). From inequality (11) we see
that $\Im z'$ lies within certain bounds making $f(z') = O(1)$, if $z$ belongs to $p_n/q_n$.
Thus $f(z) = O(z - \theta)^{-m}$ for theta functions of order $m$ and without poles.

If we know that $f(z)$ has no zeros as well as no poles, we could write
$f(z') \asymp 1$ in the above paragraph, whence $f(z) \sim (z - \theta)^{-m}$. Applying these
results to $G_{2m}(z)$ (theta functions without poles and of order 2m), we find:


(13) $G_{2m}(z) = O(z - \theta)^{-m}$, $m \ge 2$,
and
(13a) $G_{12}(z) \sim (z - \theta)^{-6}$,


where $\theta$ is an irrational number with bounded partial quotients, and $z \to \theta$
in a Stolz neighborhood.


10. Characterization of a real number by the corresponding singu- 
larity. We shall see in the concluding paragraphs to what extent the 
behaviors at real singularities determine the singularity and the function. 
First we shall see that the behavior near irrationals with bounded partial 
quotients is seldom better than a large O estimation. In fact the estimation 
(13) can not be improved to the type (13a) unless m = 6. 




THEOREM. If $\theta$ is an irrational number whose partial quotients $a_n$ do
not tend to infinity with $n$, then a sufficiently wide Stolz neighborhood at $\theta$
will include infinitely many points equivalent under the modular group to a
preassigned point of the upper half plane.


(From elementary considerations this infinitude of points would have
to converge to $\theta$.) For the proof of this theorem, consider the points
An = 0 + which belong to pn/qn. Then if runs over infinitely many 
bounded partial quotients, the images A’, of A, under the transformation 
$z' = (p'z - q')/(-q_n z + p_n)$, where $p_n p' - q_n q' = 1$, will lie in a horizontal strip $S$ by (11), with $A$ taken as (say) unity. If $z^*$ is the preassigned
point of this theorem, let $\rho$ be the smallest radius for which the union of
circles about $z^*, z^* + 1, z^* + 2, \cdots$ of non-euclidean radius $\rho$ covers the
whole strip $S$. Then if we draw about the points $A_n$ circles of non-euclidean
radius $\rho$, we should still be able to include these circles in a sufficiently wide
Stolz neighborhood. In fact, if we take instead of $A_n$ a continuously moving
point $z = \theta + i\nu$, $\nu > 0$, then as $\nu$ approaches zero continuously, the non-euclidean circles of radius $\rho$ about this moving point sweep out a sector at $\theta$,
owing to the invariance of non-euclidean distance under homothetic transformation of the $z$-plane about $\theta$. Q. E. D.

Thus we can not replace (13) by $G_{2m}(z) \sim (z - \theta)^{-m}$ unless $m = 6$,
since otherwise, as noted in § 5, $G_{2m}(z)$ has zeros. In fact, if $\theta$ has bounded
partial quotients, we can have $f(z) \sim (z - \theta)^{-m}$ in any Stolz neighborhood
of $\theta$ if and only if $f(z)$ is without zeros and poles, which, as noted in § 5,
means that the following holds:


(14) $f(z) = G_{12}(z)^k \exp g(J(z))$,
where $g(w)$ is an entire function of $w$ and $k$ is an integer.


We can supplement the last result by noting that a Poincaré theta
function $f(z)$ can behave $\sim (z - \theta)^s$, $s$ real, as $z \to \theta$ in any arbitrary Stolz
neighborhood only if $\theta$ is rational and $f(z)$ is of degree zero or $\theta$ is irrational
with bounded partial quotients and $f(z)$ is of the form given in equation
(14) above. The proof of this last statement is somewhat tedious, involving
the examination of a number of special cases. It will be omitted as it
requires no new methods. What we do is let $z$ approach $\theta$ vertically and
show that if $\theta$ does not fall into one of the above categories, then the values
taken by the theta function must vary too widely to permit such an asymptotic
behavior.

These last results are in contrast with a result of Wintner [7], who 




showed (essentially) that at almost any point $G_{12}(z)$ has a limit along some
path of approach and in fact these paths of approach can even be made to
lie in a Stolz neighborhood of preassigned arbitrarily small width. Wintner
demonstrated this property for the Dedekind $\eta$-function,⁶ which has a well-known expansion into powers of $\exp 2\pi i z$, and whose behavior as a function
can therefore be analyzed in the light of the behavior of the expansion
coefficients.

At any rate it is now apparent that the diophantine nature of a singu- 
larity of a theta function governs to a great extent the asymptotic behavior 
at the singularity, and even the function itself. 


HARVARD UNIVERSITY. 


BIBLIOGRAPHY. 


[1] L. Ford, “A geometrical proof of a theorem of Hurwitz,” Proceedings of the Edinburgh Mathematical Society, vol. 35 (1917), pp. 59-65.

[2] R. Fueter, Vorlesungen über die singulären Moduln und die komplexe Multiplikation der elliptischen Funktionen, vol. 1, Leipzig-Berlin, 1924, p. 29.

[3] G. H. Hardy and J. E. Littlewood, “Some problems in diophantine approximation
II,” Acta Mathematica, vol. 37 (1914), p. 230.

[4] J. F. Koksma, “Diophantische Approximationen,” Ergebnisse der Mathematik und
ihrer Grenzgebiete, vol. 4, Berlin, 1936, Chapter IX, 2.

[5] O. Perron, Die Lehre von den Kettenbrüchen, Leipzig, 1913, Chapter II.

[6] H. Poincaré, “Fonctions modulaires et fonctions Fuchsiennes,” Œuvres, vol. II,
Paris, 1916, pp. 592-618.

[7] A. Wintner, “A property of the elliptic modular net,” Duke Mathematical Journal,
vol. 12 (1945), pp. 451-454.


⁶ The proportionality of $G_{12}(z)$ and $\eta(z)^{24}$ was noted in § 5 above.




A PROBLEM OF PLANE MEASURE.* 


By G. G. Lorentz. 


1. Introduction. Let $A$ be a plane measurable set. For a real $x_0$ let
$A_{x_0}$ be the set of all $y$ for which the point $(x_0, y)$ belongs to $A$, that is, the
projection upon the $y$-axis of the section $x = x_0$ of the set $A$. $A_x$ is measurable for all $x$ except for a null set (that is, a set of zero measure), and we
call the function ¹


$$P_A(x) = P(x) = mA_x$$


the cross function of the set $A$ with respect to the $x$-axis. The set $A$ is not
determined by the cross function $P(x)$, but any non-negative measurable
function $P(x)$ is a cross function of a measurable set. The chief aim of
this paper is to inquire into the conditions under which a pair of functions,
$P(x)$ and $Q(y)$, is a pair of cross functions of a set $A$ on the $x$- and $y$-axes,
respectively, and if such is the case, whether $A$ is defined uniquely. We shall
consider plane sets $A$ with finite measure $\mu A < +\infty$. Then $P(x)$, $Q(y)$
are almost everywhere finite and integrable over $(-\infty, +\infty)$.

For a non-negative integrable function $P(x)$ defined for $-\infty < x < +\infty$ there exists, according to F. Riesz,² a non-increasing rearrangement
of $P(x)$, that is, a function $p(x)$, $0 < x < +\infty$, equimeasurable with $P(x)$,
for which $p(x_1) \ge p(x_2)$ for $x_1 < x_2$. Here two non-negative integrable
functions $p(x)$, $P(x)$ defined on sets $e$, $E$ respectively, are called equimeasurable, if the $x$-sets defined by $\alpha \le p(x)$ and $\alpha \le P(x)$ have the same
measure for every real $\alpha$. (Then also other similar sets, e.g. those defined
by $p(x) < \alpha$ and $P(x) < \alpha$, have the same measure. In particular, $me = mE$.)
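For a set made up of finitely many unit squares, cross functions and their non-increasing rearrangements reduce to column sums, row sums and sorting. The sketch below is a discrete illustration under that assumption; the grid `cells` and the helper names are hypothetical and not taken from the paper.

```python
# A finite union of unit cells, given as rows of 0/1 (row index = y, column index = x).
cells = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]

def cross_functions(grid):
    """Return (P, Q): P[x] = measure of the section at x, Q[y] = measure of the section at y."""
    P = [sum(row[x] for row in grid) for x in range(len(grid[0]))]
    Q = [sum(row) for row in grid]
    return P, Q

def rearrange(values):
    """Non-increasing rearrangement of a step function with unit-width steps."""
    return sorted(values, reverse=True)

def equimeasurable(u, v):
    """Check that {u >= alpha} and {v >= alpha} have the same measure for every level alpha > 0."""
    levels = set(u) | set(v)
    return all(sum(x >= a for x in u) == sum(x >= a for x in v) for a in levels if a > 0)

P, Q = cross_functions(cells)
p, q = rearrange(P), rearrange(Q)
```

Both cross functions integrate to the measure of the set, so `sum(P)` and `sum(Q)` agree; `p` is equimeasurable with `P` in the sense defined above.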

Let $y = p(x)$ and $x = q(y)$ be two non-increasing functions defined
for $x > 0$, $y > 0$, respectively. We set up the following conditions


(1) 
(6) y>0. 


* Received April 15, 1948.

¹ In the sequel $mB$ always signifies the linear Lebesgue measure of a linear set $B$,
whereas $\mu A$ signifies the plane measure of a set $A$.

² F. Riesz, “Sur une inégalité intégrale,” Journal of the London Mathematical
Society, vol. 5 (1930), pp. 162-168. F. Riesz considers functions on a finite interval.




These conditions are satisfied, for example, and in fact reduce to identities, 
if p(x) and q(y) are inverses of each other. The functions = e~, 
— 2“ log (y/2) satisfy (1) and are not inverses of each other. We can now 
formulate our main results thus: 


THEOREM 1. Suppose that $P(x)$, $Q(y)$ are two non-negative integrable
functions defined for $-\infty < x < +\infty$, $-\infty < y < +\infty$ respectively. In order that
a measurable set $A$ with cross functions $P(x)$, $Q(y)$ exists, it is necessary
and sufficient that the non-increasing rearrangements $p(x)$, $q(y)$ of these
functions satisfy conditions (1).


THEOREM 2. The set A is determined uniquely modulo null sets by its 
cross functions if and only if p(x) and q(y) are inverses of each other. 


In § 2 we collect some simple properties of equimeasurable functions and
of functions satisfying conditions (1). In § 3 we give proofs of Theorems 1
and 2. Finally, in § 4 some other problems concerning cross functions are
discussed.

2. Lemmas on equimeasurable functions and on functions satisfying
conditions (1). In the sequel $p(x)$, $q(y)$ signify non-negative, non-increasing
functions defined for $x > 0$ and $y > 0$ respectively, whose integrals over
$(0, +\infty)$ are finite. For a function $p(x)$ of this kind, we define $p^{-1}(y)$ to
be the constant $x_0$ for $p(x_0+) \le y \le p(x_0-)$, if $x_0$ is a point of discontinuity
of $p(x)$. On the other hand, if $(\alpha, \beta)$ is an interval of invariability of $p(x)$
such that $p(x) = y_0$, we choose $p^{-1}(y_0)$ arbitrarily between $\alpha$ and $\beta$. Thus


We start with some properties of functions $p(x)$, $q(y)$ satisfying
conditions (1). Making $x \to \infty$ and $y \to \infty$ in (1a) and (1b) and using
(2) we obtain


(3) roar — f 


Conversely, (1b) follows from (1a) and (3). To prove this, let $F$ be the set
of $0 < y < +\infty$ for which the closed intervals $I_y = [p^{-1}(y+), p^{-1}(y-)]$
and $J_y = [q(y+), q(y-)]$ have common points. Clearly, the complement
of $F$ is open in $y > 0$, and $F$ is closed. Suppose $y \in F$ and let $x$ be a common
point of $I_y$ and $J_y$. Then




Subtracting we obtain 


For the left-hand side $\Phi(y)$ of this equality (1a) implies $\Phi(y) \ge 0$ for $y \in F$.
Now consider a complementary interval $\alpha < u < \beta$ of $F$. The difference
$p^{-1}(u) - q(u)$ is of constant sign there. For otherwise a point $\alpha < \gamma < \beta$
would exist with the property that $p^{-1}(u) - q(u)$ changes sign in every
neighborhood of $\gamma$. Then $p^{-1}(\gamma-) \ge q(\gamma+)$, $q(\gamma-) \ge p^{-1}(\gamma+)$, which
contradicts the definition of $(\alpha, \beta)$. Thus $\Phi(y)$ is monotone in every such
interval and therefore $\Phi(y) \ge 0$ there. The same argument applies to the
complementary intervals for which $\alpha = 0$ or $\beta = +\infty$, since $\Phi(0+) = \Phi(+\infty) = 0$. This completes the proof.
For equimeasurable functions we need the following lemma: 


LEMMA 1. Suppose that $\Phi(x)$, $x \in A$ and $\Phi'(x')$, $x' \in A'$ are equimeasurable functions, and that $C$ is any set of real numbers. If $B \subset A$ and $B' \subset A'$
are sets defined by $\Phi(x) \in C$ and by $\Phi'(x') \in C$ respectively, and if $B$ is
measurable, then $B'$ is also, and $mB' = mB$.


Proof. If $C$ is an interval, the assertion follows from the definition of
equimeasurable functions. Hence, the same holds for any open set $C$ and
any closed set $C$. Let $B_0 \subset B$ be closed, bounded, and such that $\Phi(x)$ is
continuous on $B_0$. By $C_0$ we designate the set of all values of $\Phi(x)$ on $B_0$,
and by $B'_0$ the set of all $x'$ with $\Phi'(x') \in C_0$. $C_0$ is closed, hence $B'_0$ is
measurable and $mB'_0 = mB_0$. Now we have ($m_*E$ is the inner Lebesgue
measure of a set $E$)


$$mB = \sup mB_0 = \sup mB'_0 \le m_*B'.$$

First suppose $mA = mA' < +\infty$. Similar to the above we have

$$m(A - B) \le m_*(A' - B'),$$

and since $mB + m(A - B) = mA' \ge m_*B' + m_*(A' - B')$,

$$mB = m_*B', \qquad mA - mB = m(A - B) = m_*(A' - B').$$


That is, $B'$ is measurable and $mB' = mB$.

If $mA = +\infty$, we consider instead of $A$ and $A'$ their subsets $A_\alpha$, $A'_\alpha$
($\alpha > 0$) defined by $|\Phi(x)| > \alpha$ and by $|\Phi'(x')| > \alpha$ respectively. They
have a finite measure. Thus we get the measurability of $B'A'_\alpha$, and
$mBA_\alpha = mB'A'_\alpha$. The same relation for $BA_0$ and $B'A'_0$ follows by passing
to the limit; and thus we obtain the assertion of the lemma, because $\Phi$ and
$\Phi'$ vanish on sets of equal measure.
From this lemma easily follows 


LEMMA 2. If $\Phi(x)$, $\Phi'(x')$ are equimeasurable integrable non-negative
functions defined on $A$ and $A'$ respectively, then $A$, $A'$, except for two null
sets, split up in classes of points $K$, $K'$ such that the corresponding function
is constant on every class and that each class is a set of zero measure. Among
the classes $K$ and $K'$ a one-to-one correspondence $\mathfrak{S}$ exists such that $\Phi$ and $\Phi'$
have the same values on classes corresponding to each other. If a set of
the form $B = \sum K$ is measurable, the corresponding set $B' = \sum K'$ is also
measurable, and $mB = mB'$. If $\Phi(x)$ is monotonous and $A$ is an interval,
then each class $K$ is a single point.

Proof. The set $N$ of points $x \in A$, for which $\Phi'$ does not take the value
$\Phi(x)$, is a null set according to Lemma 1. We decompose the set $A - N$
in subsets $B$ on which $\Phi(x)$ is constant; they are in one-to-one correspondence
with the analogous subsets $B'$ of $A' - N'$ ($N'$ is defined similarly to $N$).
These subsets $B$, $B'$ are classes $K$, $K'$ in the case they are null sets. If, on the
other hand, $mB = mB' > 0$ (these measures are necessarily finite), we define on
$B$ and $B'$ the functions $\phi(x) = m[B \cdot (-\infty, x)]$, $\phi'(x') = m[B' \cdot (-\infty, x')]$.
Then the classes $K$, $K'$ are the subsets of $B$ and $B'$ on which $\phi$ and $\phi'$ take
the same values. It is easily seen by Lemma 1 that the classes $K$, $K'$ and
the correspondence $\mathfrak{S}$ thus defined have all the announced properties.


3. Proof of Theorems 1 and 2. 3.1. First of all we shall show: If
$P(x)$, $Q(y)$ are the cross functions of a measurable set $A$ with $\mu A < +\infty$,
and if $p(x)$, $q(y)$ are their non-increasing rearrangements, then $p$, $q$ satisfy
conditions (1). (This is the first part of Theorem 1.)

Let $A_1$ be the set obtained by “pouring” $A$ on the $x$-axis, that is the
“ordinate set” of $P(x)$, defined by $-\infty < x < +\infty$, $0 \le y \le P(x)$. $A_1$
has the cross functions $P_1(x) = P(x)$ and $Q_1(y)$, and $Q_1(y)$ is non-increasing.
We assert:

(4)

for every $v \ge 0$, if $mB \le v$. In fact, let $C$ be the set of points $(x, y) \in A$
with $y \in B$ and $C_1$ be the set of points $(x, y) \in A_1$ with $0 \le y \le v$. Then
$P_C(x) \le P(x)$ and, further, $P_C(x) \le v$. Therefore $P_C(x) \le \operatorname{Min}[P(x), v] = P_{C_1}(x)$. An integration over $(-\infty, +\infty)$ gives

(5) $\ldots \le \ldots$




Thus (4) is established. From (4) it follows immediately for the rearrangement $q(y)$ of $Q(y)$


Now we pour $A_1$ on the $y$-axis, and get a set $A_2$ with the cross functions
$P_2(x)$ and $Q_2(y)$, which are both non-increasing. $A_2$ is bounded by the axes
and by the curve $y = P_2(x) = Q_2^{-1}(x)$. Applying (6) to this second
“pouring” we get


JS J 0. 


According to § 2 and by making use of $\mu A_2 = \mu A$, this becomes


(7) 

0 
(6) and (7) imply (1b), and in a similar way (1a) may be proved.


3.2. Suppose that $p(x)$, $q(y)$ are functions satisfying conditions (1).
We shall show that a set $A$ with the cross functions $p$, $q$ exists, and further
that there are two such sets, essentially different from one another (that is,
not equivalent modulo null sets), when the relation $p = q^{-1}$ is not fulfilled.

Let $D$ be the set of all points $(x, y)$, $x > 0$, $y > 0$ which satisfy $x \le q(y)$
and $x \le p^{-1}(y)$; and let $D^+$ and $D^-$ be defined by $q(y) < x < p^{-1}(y)$ and by
$p^{-1}(y) < x < q(y)$ respectively (we only take into consideration points of
continuity $y$ of $p^{-1}$ and $q$). The sets $D^+$, $D^-$ are open; their projections on
the $y$- and the $x$-axis are disjoint open sets. For any plane set $A$ we denote by
$A(y_0)$ the part of $A$ which is situated in the strip $0 < y < y_0$. By (1b) we
have then


(8) $\mu D^-(y) \le \mu D^+(y)$, $y > 0$.
We wish to establish the representations

(9) $D^+ = \sum R_i^+ + N^+, \qquad D^- = \sum R_i^- + N^-$,

where $N^+$, $N^-$ are null sets and $R_i^+$, $R_i^-$ are open “net-squares” of the kind
$2^{-n} l < x < 2^{-n}(l + 1)$, $2^{-n} m < y < 2^{-n}(m + 1)$, $l, m, n = 0, 1, 2, \cdots$


such that $R_i^+$ and $R_i^-$ are congruent, and that $R_i^-$ lies above and on the left
of $R_i^+$.
For we can represent $D^-$ in the form $\sum R_i + N$, where $R_i$ are open net



squares. The square $R_1$ lies in a strip $\alpha < y < \beta$, which contains no points
of $D^+$. Since, by (8), $\mu D^+(\alpha) \ge \mu D^-(\beta)$, there is a $\gamma$, $0 \le \gamma < \alpha$, for which
$\mu[D^+(\alpha) - D^+(\gamma)] = \mu R_1$. The open set $D^+(\alpha) - D^+(\gamma)$ lies below and on
the right of $R_1$; thus, representations (9) exist for $D^+(\alpha) - D^+(\gamma)$ and $R_1$.
The same process, applied to $R_2, R_3, \cdots$ establishes the result. We shall
assume, as we may, that in (9) the squares are arranged according to their
size.

By means of (9) we prove assertion 2 in the following way. The set
$A_0 = D + D^+$ has the cross functions $p(x)$ and $p^{-1}(y)$. We obtain the set
$A$ of the assertion by shifting certain squares of $A_0$ vertically upwards on to
an empty place. Thus the new $y$-cross function becomes the same as for
$D + D^-$, that is $q(y)$, while the $x$-cross function remains unaltered.

These shiftings are defined inductively. Suppose that the 1, 2, $\cdots$,
$(k-1)$-st steps have been taken, leading to the set $A_{k-1}$. Further let
$\alpha < y < \beta$ be the strip in which $R_k^+$ lies and $\gamma < y < \delta$, ($\gamma \ge \beta$) the strip
of $R_k^-$. Put $l = \beta - \alpha = \delta - \gamma$. Then


Qa.(B) 2 +12 +1 
Sa(y) 


The section $(A_{k-1})_\beta$ of $A_{k-1}$ contains at least $[q(\gamma)/l] + 1$ net-intervals of
length $l$ (because $(A_0)_\beta$ was a single interval from which only net intervals
of lengths $\ge l$ were removed); in the same way the section $(A_{k-1})_\gamma$ is contained in $[q(\gamma)/l] + 1 - 1 < [q(\gamma)/l] + 1$ such intervals. Therefore, there
is in the strip $\alpha < y < \beta$ a full square $R$ of $A_{k-1}$, over which an empty place
in the strip $\gamma < y < \delta$ exists. The $k$-th step consists in the vertical displacement of $R$ into the strip $\gamma < y < \delta$. This displacement thus carries $R$ from
a strip containing points of $D^+$ (and no points of $D^-$) into one containing
points of $D^-$ (and none of $D^+$).


The set A thus formed has the required cross function, 


Qa(y) Q4(Y) — 3Qr,*(y) + (y) 
= — (y) + (y) = = 


If $p(x) = q^{-1}(x)$ does not hold for all $x$, then there are two different
sets $A$, $A'$ with the cross functions $p(x)$, $q(y)$. In fact, at the first step
of the above construction, we could shift either $R_1^+$ itself upwards or the
square of the same horizontal strip lying exactly under $R_1^-$. Consequently
there are sets $A$ with cross functions $p$, $q$ containing no points of $R_1^+$ and
such containing $R_1^-$. Interchanging the role of $x$ and $y$, that is of $p$ and $q$,



and of $R_1^+$ and $R_1^-$, we see that there exists also such a set $A$ containing $R_1^+$.
Consequently there are at least two different sets $A$.


3.3. The set A with the cross functions p, q constructed in 3. 2 clearly 

has the form 

oO co 
(10) 
with null sets NV, N and “elementary sets” H;, £;. (By a plane elementary 
set we understand the product of two measurable linear sets; in the con- 
struction above these were squares.) 

Let $A$ be a set of the form (10) with cross functions $p(x)$ and $Q(y)$
($p(x)$ being non-increasing); if moreover $P(x)$ is equimeasurable with $p(x)$,
then there is a set of the form (10) with cross functions $P(x)$ and $Q(y)$.
In fact, suppose that $B$, $B'$ are sets on which $p$, $P$ respectively do not vanish.
According to Lemma 2 of § 2, a class $K'$ of points $x'$ of $B'$ corresponds to
almost every $x \in B$. We project the set $A_x$ on the vertical lines through
every such $x' \in K'$. This process, which may be designated by $\mathfrak{S}$, leads to
the required set $A'$. For $A'$ has the cross functions $P(x)$ and $Q(y)$. Finally,
$A'$ is measurable, being again of the form (10) (for $\mathfrak{S}$ preserves elementary
sets).

The second part of Theorem 1 is now obtained by applying $\mathfrak{S}$ twice.

Finally the necessity of the condition $p(x) = q^{-1}(x)$ in Theorem 2
results. For if this condition is not fulfilled, there are two essentially
different sets $A$, $A'$ with cross functions $p$, $q$. Then $P_{A - A'}(x) > 0$ on a set of
positive measure. Since this latter condition is preserved under $\mathfrak{S}$, the above
reasoning shows that there are two essentially different sets with cross
functions $P$ and $Q$.


3.4. In order to prove Theorem 2 it remains to show that, if
$p(x) = q^{-1}(x)$, only one solution of the problem exists. We first observe
that the set $A$, constructed in 3.2 and 3.3, has the following property:


(11) For almost all $y_1$, $y_2$ the inequality $Q(y_1) \le Q(y_2)$ implies $A_{y_1} \subset A_{y_2}$.

For (11) holds for the set $A$ bounded by the curve $y = p(x)$ and by the
axes, with which we started the construction; and (11) is preserved under $\mathfrak{S}$.

Suppose now that $A$ is a set with cross functions $P$, $Q$, and that $A$ has
the property (11). We shall show that no set $A'$ essentially different from
$A$ and with the same cross functions can exist. For such a set the measure
of $A - A'$ would be positive. Therefore, we can choose an $\varepsilon > 0$ and a
$y$-set $B$ of positive measure so that


(12) $m(A_y - A'_y) \ge \varepsilon$, $y \in B$.




We may assume that $Q(y)$ is continuous on $B$, and we choose a point $y_0$ of
density 1 of this set. Let $C$ be the set of $(x, y)$ defined by $x \in A_{y_0}$,
$-\infty < y < +\infty$. Then


$$\mu CA = \int_{A_{y_0}} P(x)\,dx = \mu CA'$$

and therefore


(13) $\int_{-\infty}^{+\infty} mCA_y\,dy = \int_{-\infty}^{+\infty} mCA'_y\,dy$.


If $Q(y) \le Q(y_0)$, we have $A_y \subset C$ in accordance with (11) and therefore

$$mCA_y = Q(y) = mA'_y \ge mCA'_y;$$


and if $Q(y) \ge Q(y_0)$,

$$mCA_y = mC \ge mCA'_y.$$


Hence $mCA'_y \le mCA_y$ almost everywhere, and (13) implies $mCA'_y = mCA_y$
almost everywhere. We shall show that this contradicts (12). Let $\varepsilon > 0$
be arbitrary; we choose $\delta > 0$ such that


| mA, — mA,, | < yeB, |y—yo|<s 


Then mCA, > mA,,—<«/2 for these y; moreover mD,y < mAy,, + €/2, if the 
set D is defined by re Ay+ Ay, y arbitrary. On the other hand, because 
of AyC D and (12), 


m(D — A’,) —m(D—DA’y) =; 
hence 
mC A’, = mDA’, = mD — < mAy, — €/2 < mCAy 


for all y from a certain subset of B of positive measure, in contradiction to 
what was proved above. 


4. Similar problems. 4.1. Up to now we have assumed that the cross
functions are integrable, that is, that the set $A$ has a finite measure. But
the problems treated have also a sense in the case $\mu A = +\infty$. However,
they cannot be dealt with by the method applied before, because a non-integrable function $P(x)$ does not generally possess a monotone rearrangement. We shall deal with a special case to show that matters are now
quite different than if $\mu A < +\infty$.


THEOREM 3. Suppose that $P(x)$, $Q(y)$ are two continuous increasing
functions, defined for $x \ge 0$, $y \ge 0$ respectively, and vanishing at the origin.




Then there exists exactly one set $A$, bounded by two curves of the form
$y = f(x)$, $y = g(x)$, $f(x) \le g(x)$, with continuous increasing functions $f(x)$,
$g(x)$, vanishing for $x = 0$, and with cross functions $P$, $Q$.


In other words there is exactly one pair of functions $f$, $g$ of the kind
mentioned satisfying the system of functional equations


(14) $g(x) - f(x) = P(x)$, $\qquad f^{-1}(y) - g^{-1}(y) = Q(y)$.


Proof. We put
(15) $f_0(x) = 0$, $g_0(x) = P(x)$


(16) $f_n^{-1}(y) = g_{n-1}^{-1}(y) + Q(y)$, $g_n(x) = f_n(x) + P(x)$, $n = 1, 2, \cdots$.
Except for fo, all these functions are increasing and continuous; they vanish 
at the origin. From (16) follows easily: if 2 fn+(r) for all 
then also = fn(2). Therefore the limiting functions f(r) = lim f,(z2), 
=lim gn(x) exist. They are finite, since g,(z) P(x), and con- 
tinuous, on account of gn(2’) — gn(@) S P(2’) — P(x) forxSv. Further- 
more they are monotone increasing. Making n— o in (16) we see that 
f, g satisfy equations (14). 
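For linear cross functions the successive approximations can be followed in closed form. The sketch below is an illustration under stated assumptions, not the author's computation: it takes $P(x) = cx$, $Q(y) = dy$, reads the recursion as $f_n^{-1}(y) = g_{n-1}^{-1}(y) + Q(y)$, $g_n(x) = f_n(x) + P(x)$ with $f_0 = 0$, $g_0 = P$ (the printed recursion is damaged, so this reading is an assumption), and the name `iterate_slope` is hypothetical. With $f_n(x) = a_n x$ the recursion becomes $1/a_n = 1/(a_{n-1} + c) + d$.

```python
import math

def iterate_slope(c, d, steps=60):
    """Iterate the slopes a_n of f_n(x) = a_n * x for P(x) = c*x, Q(y) = d*y.

    Assumed recursion: f_n^{-1}(y) = g_{n-1}^{-1}(y) + Q(y), g_n(x) = f_n(x) + P(x),
    starting from f_0 = 0, g_0 = P.
    """
    a = 0.0                              # slope of f_0
    for _ in range(steps):
        g_slope = a + c                  # g_{n-1}(x) = (a_{n-1} + c) * x
        a = 1.0 / (1.0 / g_slope + d)    # 1/a_n = 1/(a_{n-1} + c) + d
    return a, a + c                      # slopes of the limits f and g

f_slope, g_slope = iterate_slope(0.5, 0.5)
residual_x = (g_slope - f_slope) - 0.5            # g(x) - f(x) - P(x), per unit x
residual_y = (1 / f_slope - 1 / g_slope) - 0.5    # f^{-1}(y) - g^{-1}(y) - Q(y), per unit y
```

For $c = d = \tfrac12$ the limiting slopes are $(17^{1/2} \mp 1)/4$, the values that reappear in the example of 4.2.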

Finally f, g is the only solution. For let f’, g’ be another solution and 
f not identical with f’, say f’(a) —f(a) >0 for ana>0. Let 2 be the 
point of [0,a] where f’(z)}—f(z) reaches its maximum. We put 
Yo=f(20), (Xo), further 


(Yo) =f (Yo) —O (Yo) = — Q(Yo), 
a’ = 9'" (Yo) = %— Q(y’o); 
then 0 < 2’; < 41 < 2%. Hence 
f (2:1) = — = yo— P(a), 
(21) P(2'1) > yo — ; 


and therefore 


contradicting the definition of 2». 


4.2. We may ask which properties the functions P(x), Q(y) must have 
in order that a corresponding convex set A exists. A necessary condition is 




that the functions $P(x)$, $Q(y)$ be convex. But this is not sufficient, as is
shown by the following example. Let

$$P(x) = Q(x) = \begin{cases} x/2 & \text{for } 0 \le x \le 1, \\ (2 - x)/2 & \text{for } 1 \le x \le 2, \\ 0 & \text{otherwise.} \end{cases}$$

If a convex region $A$ existed for $P$, $Q$, it would have boundary points on each
side of the square $0 \le x \le 2$, $0 \le y \le 2$. But these boundary points can
only lie at the edges of the square. For if, e.g., $(0, y_0)$, $0 < y_0 < 2$, were a
boundary point of $A$, then $A$, at least in the neighbourhood of $(0, y_0)$, would
contain the interior of the angle $(2,0)(0, y_0)(2, 2)$, and this interior has an
$x$-cross function $x > P(x)$. Because of the symmetry we may assume
$(0,0) \in A$, then also $(2,2) \in A$. So $A$ is bounded by two curves $y = f(x)$,
$y = g(x)$ ($f(x) \le g(x)$, $f$, $g$ monotone) joining $(0,0)$ and $(2,2)$. By
conclusions similar to those in 4.1 we can show that exactly one pair of
monotone functions $f$, $g$ exists for $0 \le x \le 1$ and $0 \le y \le 1$, satisfying
conditions (14). Such a pair is $f(x) = (17^{1/2} - 1)x/4$, $g(x) = (17^{1/2} + 1)x/4$.
By continuation, this is still true in some intervals $0 \le x \le \alpha$, $0 \le y \le \alpha$,
$\alpha > 1$. Considering in the same way the course of the curves in the neighbourhood of $(2,2)$ we find $2 - g(x) = (17^{1/2} - 1)(2 - x)/4$ in the interval
$1 \le x \le 2$, contradicting the preceding value of $g(x)$ for $1 < x < \alpha$.


4.3. The cross function of a plane set $A$ with respect to a direction $x$
is the cross function of $A$ with respect to any axis of this direction. A set
$A$ is not always uniquely determined by its cross functions with respect to
two directions. The same holds for any finite system $x_1, \cdots, x_n$ of
directions: Two disjoint bounded sets $A$, $B$ exist possessing the same cross
functions with respect to these directions. For $n = 1$ the assertion is trivial,
and in the general case it is proved by induction. Suppose that $A_{n-1}$, $B_{n-1}$
are two disjoint bounded sets with equal cross functions with respect to
$x_1, \cdots, x_{n-1}$. We displace $A_{n-1} + B_{n-1}$ parallel to itself in the direction
perpendicular to $x_n$ so that the new set $A'_{n-1} + B'_{n-1}$ has no points in common
with $A_{n-1} + B_{n-1}$. Then $A_n = A_{n-1} + B'_{n-1}$ and $B_n = A'_{n-1} + B_{n-1}$ are disjoint and have equal cross functions with respect to $x_1, \cdots, x_n$.
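The induction step can be followed on unit cells for the two coordinate directions. The sketch below is a discrete illustration only: sets are represented as sets of integer cell coordinates, and the helper names and the particular displacements are assumptions. Starting from a single cell, one step in each direction yields two disjoint sets with equal cross functions with respect to both axes.

```python
from collections import Counter

def translate(cells, dx, dy):
    """Displace a set of unit cells parallel to itself."""
    return {(x + dx, y + dy) for (x, y) in cells}

def cross_x(cells):
    """Cross function with respect to the x-axis: number of unit cells in each column."""
    return Counter(x for (x, _) in cells)

def cross_y(cells):
    """Cross function with respect to the y-axis: number of unit cells in each row."""
    return Counter(y for (_, y) in cells)

# n = 1 (direction x): B_1 is A_1 displaced perpendicular to the x-direction.
A1 = {(0, 0)}
B1 = translate(A1, 0, 1)

# n = 2 (direction y): displace A_1 + B_1 perpendicular to y, i.e. along x,
# far enough to be disjoint from A_1 + B_1, then recombine crosswise.
A1s, B1s = translate(A1, 2, 0), translate(B1, 2, 0)
A2 = A1 | B1s
B2 = A1s | B1
```

The resulting pair `A2`, `B2` occupies opposite corners of a rectangle, which is the simplest instance of the non-uniqueness asserted in the text.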

Further possible questions are these: Is a measurable³ set uniquely
determined by its cross functions with respect to all straight lines? Is a set
necessarily convex if all its cross functions are convex?


TUBINGEN, GERMANY. 


³ For non-measurable sets the answer is negative. There are non-measurable sets
for which all cross functions vanish (W. Sierpiński, Fundamenta Mathematicae, vol. 1
(1920), pp. 112-115).




DIFFERENTIAL GEOMETRY OF A GENERAL MOTION IN THE 
COMPLEX PROJECTIVE LINE.* 


By P. O. BELL.


1. Introduction. The purpose of the author in the present paper is to
study the complex projective differential geometry of a general motion of a 
point in the complex line. Such a motion may be conveniently portrayed 
as a motion of a point along a curve in the Cauchy plane. The funda- 
mental group of this geometry is the group of complex projective trans- 
formations of the points of the complex line and the methods are the usual 
methods of projective differential geometry. The methods employed in parts
of Sections 2 and 3 are similar to those which E. Cartan [1, pp. 28-42] has used in
his study of the geometry of the motions of a point in the complex line 
subject to the group of real projective transformations. The moving reference 
frame employed by Cartan owes its geometric significance to the preservation 
of the Poincaré half-plane [1, pp. 33-34] by the real projective transformations 
of the points of the complex line. The present study calls for a local reference 
frame (for the definition of projective homogeneous coordinates) which 
possesses geometric significance with respect to the larger group of complex 
projective transformations. Such a reference frame attached to a moving 
point of a general curve of the Cauchy plane is introduced and geometrically 
characterized in the present paper. 

Throughout the present paper the point of view adopted is that of kinematics.
The motion of a point $z(t)$ will be classified according to its relation
to time $t$ as independent parameter. According as the real and imaginary
parts $R$, $I$ of the Schwarzian derivative $\{z, t\}$ satisfy one of the following
conditions:


(1) $I \neq 0$, (2) $I = 0$, $R < 0$, (3) $I = 0$, $R > 0$, or (4) $I = R = 0$,


the motion of z at an instant t will be called (1) loxodromic, (2) hyperbolic,
(3) elliptic, or (4) parabolic, respectively. Each type of motion is associated
with a homography of a type referred to by the same name as follows: A
curve $C_{t_0}$ is uniquely determined by the following properties: it is tangent
to C at $z(t_0)$, and the non-homogeneous coordinate w of its generic point




* Received March 30, 1948; revised January 6, 1949. 




w(t) satisfies the condition $\{w, t\} = \{z, t\}_{t_0}$. A displacement of points of $C_{t_0}$
produced by an arbitrary parametric translation $t = t' + h$ can be produced
by a homography of the type named. These four categories of homographies 
are entirely equivalent to the classical ones which have the same names 
(cf. Osgood [2, pp. 262-270]). The kinematic characterizations of the types 
(2), (3) and (4) reduce to those of Cartan [1, pp. 39-42] upon restricting 
the group to be that of the real projective transformations of a point in the 
complex line. The curves of most general type which can be mapped upon 
themselves by a transformation (other than the identity) of the group are 
loxodromes. If the fixed points of the mapping consist of a finite point and 
the infinite point, the loxodrome becomes a logarithmic spiral. Circles 
appear as special loxodromes which correspond to real values of a certain 
invariant function. Since the loxodromes appear as basic curves in the 
classical theory of the loxodromic homographies when viewed from the stand- 
point of the theory of functions of a complex variable [2, pp. 261-262] it is 
not surprising that they play an important role in the present theory. 
Various authors have employed the invariants known as inversive arc 
length and inversive curvature in their studies of inversive differential geometry
of plane curves, notably Mullins (in a Columbia dissertation in 1917), H. 
Liebmann [3], T. Kubota [4], J. Maeda [5], F. Morley [6], and B. C. 
Patterson [7]. Inversive arc length of a curve C is a real parameter p such 
that at a generic point z(p) of the curve the imaginary coefficient of the 
Schwarzian derivative of z with respect to p is constantly equal to 1. The 
real part of this Schwarzian derivative is the differential invariant known as 
inversive curvature. These invariants would serve as a basis for the study 
of the intrinsic differential geometry of plane curves under homographies, 
since the group of homographies is a subgroup of the inversive group. How- 
ever, since the inversive arc length vanishes at all points of a circle and at 
points of a curve where the ordinary curvature is constant, it is highly 
desirable for the purposes of a kinematic study to introduce a metric so
characterized that except at a parabolic point of the curve the associated
arc length does not vanish along the curve. Such an arc length, called the
homographic arc length of a curve C, will be defined in terms of an absolute
$(\zeta, \eta)$ geometrically determined at each non-parabolic point of the curve. The
element of homographic arc length also possesses advantages of symmetry
and in addition actually characterizes a euclidean geometry with respect to
homographies. Along a “non-euclidean straight line” in the half-plane of
Poincaré [8, pp. 7-8] on which the Poincaré metric is defined, the element
of homographic arc length becomes identical with the element defined by the
Poincaré metric. The geometric characterization of the element of homographic
arc length of C at z(t) will be expressed very simply in terms of a
certain angle $\theta$ and its differential $d\theta$ whose determinations depend upon the
particular type of motion of the point z(t) at the instant t.

The homographic distance between two points P, Q with respect to an
absolute $(\zeta, \eta)$ is defined by


$PQ = |\log(\zeta\eta PQ)|$,


in which the right member is the absolute value of the logarithm of the cross-ratio
of the four complex points $\zeta$, $\eta$, P, Q. A “straight line” with respect
to this metric (which will be called an s-line) will be defined to be a curve
whose homographic arc length between any two of its points is equal to the
homographic distance between these two points. An s-line is found to be a
curve which can be mapped upon itself by a homography whose double points
are $\zeta$ and $\eta$; the general s-line is a loxodrome, special cases are logarithmic
spirals, arcs of circles having $\zeta$ and $\eta$ as endpoints, and the circles which are
orthogonal trajectories of these arcs. The homographic metric defined with
respect to a given absolute $\zeta$, $\eta$ and the plane geometry of this metric are
found to be euclidean; a simple conformal mapping of the s-lines of the
plane upon the ordinary euclidean straight lines of the plane is obtained.


2. Normal coordinates and the Schwarzian derivative. Let C denote
a curve described in the Cauchy plane by a moving point z whose homogeneous
complex projective coordinates are taken to be $(1, z(t))$ where t is a
real parameter. The coordinates of z and those of any projective transform
of z satisfy a second order differential equation of the form


(2.1) $z'' + pz' = 0$


in which p is a function of t and the accent denotes differentiation with respect
to t. Curves projectively equivalent to C as well as the curve C are, therefore,
integral curves of (2.1). To normalize homogeneous coordinates of the
point z let a proportionality factor $\lambda$ be introduced by the formula $z = \lambda\omega$.
Equation (2.1) assumes the form


$\lambda\omega'' + (2\lambda' + p\lambda)\omega' + (\lambda'' + p\lambda')\omega = 0$.


Let $\lambda$ be chosen so that the coefficient of $\omega'$ vanishes, that is $\lambda = \exp(-\tfrac{1}{2}\int p\,dt)$.
With this choice of $\lambda$ the differential equation of the curve reduces to the form


(2.2) $\omega'' + r\omega = 0$




where r is an invariant defined by $r = -(2p' + p^2)/4$. It may also be
shown that r may be expressed in terms of the non-homogeneous coordinate z
by the relations

$2r = \{z, t\} = z'''/z' - \tfrac{3}{2}(z''/z')^2$,


that is, 2r is equal to the Schwarzian derivative of z with respect to t. Moreover,
the proportionality factor $\lambda$ can be expressed by the formula $\lambda = (z')^{1/2}$,
so that the normal projective homogeneous coordinates of the point z are
given by the relations $\omega^1 = 1/\sqrt{z'}$, $\omega^2 = z/\sqrt{z'}$.

The author has shown in a former paper [9, p. 490] that the Schwarzian
derivative can be expressed in terms of euclidean curvature $\gamma$ and euclidean
arc length s by means of the formula


(2.3) $\{z, t\} = (i\,d\gamma/ds + \gamma^2/2)(ds/dt)^2 + \{s, t\}$.


On making use of the formula for interchanging the variables in the Schwarzian
derivative

$\{s, t\}\,dt^2 + \{t, s\}\,ds^2 = 0$,


equation (2.3) may be made to assume the form

(2.4) $\{z, t\} = (i\,d\gamma/ds + \gamma^2/2 - \{t, s\})(ds/dt)^2$.

The real and imaginary components of r,

$R = (\gamma^2 - 2\{t, s\})(ds/dt)^2/4$, $I = (d\gamma/ds)(ds/dt)^2/2$,

are, themselves, absolute invariants of the complex projective transformations

(2.5) $\Omega^1 = c\omega^2 + d\omega^1$, $\Omega^2 = a\omega^2 + b\omega^1$,


in which the coefficients are complex constants, the non-homogeneous coordinates
of z and Z being defined by $z = \omega^2/\omega^1$ and $Z = \Omega^2/\Omega^1$, respectively.
The following are, therefore, invariant equations


$(\gamma^2/2 - \{t, s\})\,ds^2 = (\bar\gamma^2/2 - \{t, \bar s\})\,d\bar s^2$, $d\gamma\,ds = d\bar\gamma\,d\bar s$,


where $\gamma$, $\bar\gamma$ and $ds$, $d\bar s$ are corresponding euclidean curvatures and arc length
elements of a curve C and its transform $\bar C$ by a transformation (2.5). The
transformations (2.5) may be written in the form of the homographies


(2.6) $Z = (az + b)/(cz + d)$


in which a, b, c, d are complex constants.
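
The invariance of $2r = \{z, t\}$ under a homography (2.6) can be checked numerically. The following is a minimal sketch, not part of the original argument: the sample curve $z(t) = t + it^3$, the coefficients a, b, c, d, and the helper names `schwarzian` and `homography_derivatives` are all illustrative assumptions, and the derivatives of Z are obtained from those of z by the ordinary chain rule.

```python
def schwarzian(z1, z2, z3):
    """Schwarzian derivative {z, t} from the first three derivatives of z."""
    return z3 / z1 - 1.5 * (z2 / z1) ** 2


def homography_derivatives(a, b, c, d, z, z1, z2, z3):
    """First three t-derivatives of Z = (a z + b)/(c z + d) by the chain rule."""
    det = a * d - b * c
    q = c * z + d
    Z1 = det * z1 / q**2
    Z2 = det * (z2 / q**2 - 2 * c * z1**2 / q**3)
    Z3 = det * (z3 / q**2 - 6 * c * z1 * z2 / q**3 + 6 * c**2 * z1**3 / q**4)
    return Z1, Z2, Z3


# Hypothetical sample curve z(t) = t + i t^3, evaluated at t = 0.7.
t = 0.7
z, z1, z2, z3 = t + 1j * t**3, 1 + 3j * t**2, 6j * t, 6j

# Hypothetical complex coefficients with a d - b c != 0.
a, b, c, d = 2 + 1j, -1.0, 0.5j, 3 - 2j

s_before = schwarzian(z1, z2, z3)
s_after = schwarzian(*homography_derivatives(a, b, c, d, z, z1, z2, z3))
r = s_before / 2  # 2r = {z, t}

print(abs(s_after - s_before))  # close to zero up to rounding
print(r.real, r.imag)           # the invariants R and I at this instant
```

The two Schwarzian values agree to rounding error, which is the statement that R and I are absolute invariants of the transformations (2.5).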




3. Homographic mappings of curves upon themselves; fixed points
of the mappings. A parametric mapping of a curve C upon itself is determined
by two relations of the form


(3.1) $\Omega(t) = \omega(\bar t)$,


where $\bar t$ is a single valued function of t. The curve C may be denoted by $C_t$
or $C_{\bar t}$ according as $\omega(t)$ or $\omega(\bar t)$ is regarded as the generic point, respectively.
Certain curves exist each of which possesses the property that a parametric
mapping function $\bar t(t)$ exists which maps an arbitrarily chosen point $\omega_0$ of
the curve upon another arbitrarily chosen point $\Omega_0$ of the curve and simultaneously
maps the curve upon itself in such a manner that the mapping
can be obtained by a homography


(3.2) $Z(t) = (az(t) + b)/(cz(t) + d)$


where $z = \omega^2/\omega^1$. These curves will now be determined.


The equations

(3.3) $d^2\omega(\bar t)/d\bar t^2 + r(\bar t)\,\omega(\bar t) = 0$, $d^2\Omega(t)/dt^2 + r(t)\,\Omega(t) = 0$,


are satisfied identically in $\bar t$ and t, respectively. The first of these is merely
the differential equation of C, and the second expresses the condition that
the curve $C_{\bar t}$ be equivalent to $C_t$ with respect to a transformation (3.2).
On substituting for $d^2\Omega/dt^2$ in (3.3) the right member of the relation


$d^2\Omega/dt^2 = (d^2\omega/d\bar t^2)(d\bar t/dt)^2 + (d\omega/d\bar t)(d^2\bar t/dt^2)$

one obtains the equation

$d^2\omega(\bar t)/d\bar t^2 + (d\omega/d\bar t)(d^2\bar t/dt^2)(dt/d\bar t)^2 + r(t)\,\omega(\bar t)(dt/d\bar t)^2 = 0$.


Comparison of this equation with the first equation of (3.3) yields the
relations


(3.4) $d^2\bar t/dt^2 = 0$, $r(\bar t) = r(t)(dt/d\bar t)^2$,


because the points $\omega(\bar t)$, $d\omega/d\bar t$, $d^2\omega/d\bar t^2$ are assumed to be linearly independent.
From the first of these relations it follows that the mapping functions $\bar t$ are
of the form


(3.4) $\bar t = kt + h$, k, h = const.




The second relation therefore becomes


(3.5) $k^2\,r(kt + h) = r(t)$.


The constant h may be arbitrarily selected, since no further restrictions are
imposed upon h than those given by equations (3.4). To determine k, hold t
and k fixed and allow h to vary. It follows from (3.5) that $r(kt + h)$ is a
constant equal to r(t) and $k = \pm 1$. The curves whose parametric mappings
upon themselves can be obtained by homographies are therefore the r(t)
= constant curves, and the mapping functions are of the forms $\bar t = \pm t + h$.
These mapping functions serve to define what will be called direct and indirect
parametric displacements of the points of a curve C according as the sense
of variation of the parameter is maintained or is reversed, respectively. It
is clear now that an arbitrary pair of points $\omega_0$, $\Omega_0$ may be put in correspondence
by a direct (or an indirect) displacement with a suitable choice of the
constant h.

Let $r = R + iI$. A transformation which maps an r = const. curve upon
itself will be called (i) loxodromic, (ii) hyperbolic, (iii) elliptic, or (iv) parabolic,
according as (i) $I \neq 0$, (ii) $I = 0$, $R < 0$, (iii) $I = 0$, $R > 0$, or (iv)
$I = R = 0$, respectively. If t is taken to be time, a kinematic interpretation
of the description of a curve by a generic point $\omega(t)$ is obtained upon considering
the parametric displacements of type (i) to (iv). A motion along
an r = const. curve during a lapse of time may therefore be called loxodromic,
hyperbolic, elliptic, or parabolic, according to the stipulations upon r stated
above.
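
The fourfold classification just stated can be written out as a short routine. This is only an illustrative sketch: the function name `classify_motion` and the tolerance `tol` (needed because floating-point values of I and R are rarely exactly zero) are assumptions and do not occur in the text.

```python
def classify_motion(r, tol=1e-12):
    """Name the type of motion from the value r = R + iI at an instant.

    tol is a hypothetical threshold below which R or I is treated as zero.
    """
    r = complex(r)
    R, I = r.real, r.imag
    if abs(I) > tol:
        return "loxodromic"   # (i)   I != 0
    if R < -tol:
        return "hyperbolic"   # (ii)  I = 0, R < 0
    if R > tol:
        return "elliptic"     # (iii) I = 0, R > 0
    return "parabolic"        # (iv)  I = R = 0


for sample in (0.5 + 2j, -4.0, 9.0, 0.0):
    print(sample, classify_motion(sample))
```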

Since the imaginary component of r is given by $I = (d\gamma/ds)(ds/dt)^2/2$
and $ds/dt \neq 0$, it follows that $d\gamma/ds = 0$ if $I = 0$. Hence, the $I(r) = 0$
curves are circles.

The most general r = const. curves are integral curves of a differential
equation of form (2.2). The homogeneous coordinates of a general point
of such a curve are therefore given by the form


(3.6) $\omega^j = a_j \exp(i\sqrt{r}\,t) + b_j \exp(-i\sqrt{r}\,t)$, $j = 1, 2$; $a_j$, $b_j$ = const.


These curves are, clearly, projectively equivalent to the curve defined by
$\omega^1 = 1$, $\omega^2 = e^{2i\sqrt{r}\,t}$, that is, the curve defined in a non-homogeneous coordinate
by

(3.7) $z = e^{2i\sqrt{r}\,t}$.


It follows that the r = real const. curves are circles.




If r = const., but $I \neq 0$, it follows that the two square roots of r may
be written as $\sqrt r$, $-\sqrt r$ where $\sqrt r = \mu + i\nu$, $\nu \neq 0$. Equation (3.7) may be
written in the form


(3.8) $\log z = -2\nu t + 2i\mu t$.

Upon substituting $\theta/2\mu$ for t this equation assumes the form

(3.9) $\log z = -(\nu/\mu)\theta + i\theta$, $\nu/\mu$ = const.


The curve represented by this equation is a logarithmic spiral, and the curve
(3.6) of most general type which is projectively equivalent to it is a non-degenerate
loxodrome.


An $I = 0$ curve may be mapped upon itself by a transformation of one
of the last three types, the type being determined by the nature of the parameter
used in the displacement.


THEOREM. A curve obtained by transforming a given non-degenerate
logarithmic spiral by a homography (2.6) in which either c or d is equal to
zero is a logarithmic spiral which can be superposed upon the original curve
by means of a euclidean rigid displacement.


To prove this theorem it is sufficient to determine the effects of (1) an
expansion and rotation Z = az, and (2) an inversion Z = 1/z. A translation,
of course, merely displaces the curve. As z describes a given logarithmic
spiral defined by


(3.10) $\log z = k\theta + i\theta$, k = const. $\neq 0$,


the locus of a point Z defined by Z = az, where $\log a = \alpha + i\beta$, is determined
by the equation $\log Z = k\theta + \alpha + i(\theta + \beta)$. This may be written in the
form $\log Z = k\Theta + i\Theta + i(\beta - \alpha/k)$, where $\Theta = \theta + \alpha/k$. The locus thus
defined is therefore that of the given logarithmic spiral rotated through the
angle $\beta - \alpha/k$. Hence the transformed spiral can be superposed upon the
original spiral by a rigid euclidean displacement. As z describes the locus
(3.10) the point Z given by Z = 1/z moves along the same curve, since the
locus of Z is represented by the equation


(3.11) $\log Z = -\log z = -k\theta - i\theta$,


the angle of Z being the negative of the angle of z. This completes the proof.
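
The first step of the proof, that Z = az carries the spiral (3.10) into the same spiral rotated through $\beta - \alpha/k$, lends itself to a numerical check. The sketch below is illustrative only; the values of k and a, the sample range of $\theta$, and the helper names `spiral` and `max_deviation` are assumptions.

```python
import cmath
import math

k = 0.3                      # hypothetical spiral constant, k != 0
a = 1.7 * cmath.exp(0.4j)    # hypothetical multiplier, log a = alpha + i*beta
alpha, beta = math.log(abs(a)), cmath.phase(a)


def spiral(theta):
    """Point of the spiral log z = k*theta + i*theta."""
    return cmath.exp((k + 1j) * theta)


def max_deviation(thetas):
    """Largest gap between a*z(theta) and the rotated spiral at theta + alpha/k."""
    rotation = cmath.exp(1j * (beta - alpha / k))
    return max(abs(a * spiral(th) - rotation * spiral(th + alpha / k))
               for th in thetas)


samples = [-2.0 + 0.25 * n for n in range(17)]
print(max_deviation(samples))  # close to zero up to rounding
```

Every image point a z($\theta$) coincides with the point of parameter $\Theta = \theta + \alpha/k$ on the rotated spiral, so the transformed curve is the original one turned rigidly about the pole.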
Let $\psi$ denote the angle which the tangent to a logarithmic spiral r = const.
makes with the radius vector $\rho$ drawn from the pole (asymptotic point of the
logarithmic spiral) to the point of contact of the tangent. The tangent of
this angle is defined by the relation $\tan\psi = \rho\,d\theta/d\rho = -\mu/\nu$. Since $\sqrt r$ is
given by $\mu + i\nu$ and it is invariant of the transformations (2.6), $\tan\psi$
is itself an invariant. Let $\varphi$ denote the angle of $\mu + i\nu$. It follows that
$\tan\varphi\,\tan\psi = -1$, and therefore, one of the relations $\psi = \varphi \pm \pi/2$ is satisfied.
If the direction of the $\theta$ = const. curve is known at a point, the direction
of the point of the loxodrome which corresponds to a given value of r is
easily determined by this relation.

To determine the fixed points of a homographic mapping which maps a
curve C upon itself, let $\omega(t)$ be a generic point of the curve. Then any point
P of the plane is given by an expression of the form $\omega + u\omega'$ where u is a
suitable complex number. Let u be a constant to be determined by the
condition that the point P be a fixed point. This condition is represented
by a relation of the form $(\omega + u\omega')' = \lambda(\omega + u\omega')$ where $\lambda$ is a proportionality
factor. Substituting $-r\omega$ for $\omega''$ in the left member of this relation yields
$\omega' - ru\omega = \lambda(\omega + u\omega')$. This relation implies that $ru^2 = -1$, since the
points $\omega$, $\omega'$ are linearly independent. The homogeneous coordinates of the
two fixed points of the mapping are, therefore, given by the forms


(3.12) $\zeta = \omega + i\omega'/\sqrt r$, $\eta = \omega - i\omega'/\sqrt r$.


Taylor’s development for $\omega(t + h)$ along an r = const. curve yields, in
view of (2.2),


(3.13) $\omega(t + h) = \omega(t)(1 - rh^2/2! + r^2h^4/4! - \cdots)$
$+\ \omega'(t)(h - rh^3/3! + \cdots)$
$= \omega\cos\sqrt r\,h + \omega'(\sin\sqrt r\,h)/\sqrt r$.


Since the above series represent integral functions, the representation is valid
for all real values of h. Solving (3.12) for $\omega$ and $\omega'$ in terms of $\zeta$, $\eta$ and
substituting the expression for them in (3.13) yields, apart from a common
factor 1/2 which is immaterial for homogeneous coordinates,


(3.14) $\omega(t + h) = \zeta\exp(-i\sqrt r\,h) + \eta\exp(i\sqrt r\,h)$.


Since the general homogeneous coordinates of the points $\zeta$, $\eta$ are independent
of h, these points form a convenient basis for the definition of local homogeneous
coordinates. The local homogeneous coordinates of the point
$\omega(t + h)$ with respect to this basis will be the coefficients of $\zeta$ and $\eta$ in the
form (3.14), and the local non-homogeneous coordinates of this point will
be $u = \exp(2i\sqrt r\,h)$. The double points $\zeta$, $\eta$ of a homography which maps
a curve C upon itself lie on the curve if the mapping is loxodromic or
hyperbolic; they coincide in a point of C if the mapping is parabolic; or
they do not lie on the curve if the mapping is elliptic.
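
Relations (3.12) to (3.14) can be verified on a concrete solution of (2.2). The following is a sketch under stated assumptions: the constant value of r, the particular solution pair $(e^{-i\sqrt r t}, e^{i\sqrt r t})$, the sample values of t and h, and the helper names are all hypothetical. The factor 1/2 that drops out of the homogeneous form (3.14) is kept explicitly so that the two sides can be compared coordinate by coordinate.

```python
import cmath

r = 0.8 - 0.6j           # hypothetical constant value of r
root = cmath.sqrt(r)     # one of the two square roots of r


def omega(t):
    """One solution of w'' + r w = 0, as a pair of normal coordinates."""
    return (cmath.exp(-1j * root * t), cmath.exp(1j * root * t))


def omega_prime(t):
    """Derivative of omega(t) with respect to t."""
    return (-1j * root * cmath.exp(-1j * root * t),
            1j * root * cmath.exp(1j * root * t))


def combine(p, x, q, y):
    """Coordinate-wise p*x + q*y for coordinate pairs x, y."""
    return tuple(p * xi + q * yi for xi, yi in zip(x, y))


t, h = 0.3, 1.1
w, w1 = omega(t), omega_prime(t)
zeta = combine(1, w, 1j / root, w1)     # first fixed point of (3.12)
eta = combine(1, w, -1j / root, w1)     # second fixed point of (3.12)

# Right member of (3.13).
taylor = combine(cmath.cos(root * h), w, cmath.sin(root * h) / root, w1)
# Right member of (3.14), with the common factor 1/2 restored.
via_fixed = combine(0.5 * cmath.exp(-1j * root * h), zeta,
                    0.5 * cmath.exp(1j * root * h), eta)

err_taylor = max(abs(x - y) for x, y in zip(taylor, omega(t + h)))
err_fixed = max(abs(x - y) for x, y in zip(via_fixed, omega(t + h)))
print(err_taylor, err_fixed)  # both close to zero up to rounding
```

For this particular solution $\zeta$ and $\eta$ reduce to multiples of (1, 0) and (0, 1), which makes it plain that as points they do not depend on h.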

It has been shown that a mapping of a curve C upon itself by a transformation
(3.2) is represented by a parametric displacement $\bar t = t + h$. If
the displacement is defined in another parametric system by $\bar v = v + k$,
it follows that v and t are related by an equation of the form


(3.15) $v = ct + d$


where c and d are constants. For, let v be a continuous function of t with
a derivative defined for each value of t; then $v(t + h) - v(t) = k(h)$,
where $\bar v(t) = v(t + h)$. Dividing the members of this equation by h and
letting h tend to zero yields the relation $v'(t) = \lim k(h)/h$. Since the right
member is independent of t, $v'(t)$ = constant, and therefore v is given by
(3.15).

It may also be shown, although the proof will be omitted here, that if
two distinct displacements of the points of a curve, represented in different
parameters by $\bar t = t + h$ and $\bar v = v + k$, result from two homographic
mappings of the z plane which have the same pair of fixed points $\zeta$, $\eta$, the
parameters v, t are related by an equation of the form (3.15).


4. A homographic metric. To establish a suitable local reference frame
for the definition of homogeneous coordinates of a moving point on a curve
C, let $\omega(t)$ denote normal coordinates of the point. Consider the r = constant
curve which is tangent to C at $\omega$. A mapping of the plane which maps this
curve upon itself has two fixed points given by


$\zeta = \omega + i\omega'/\sqrt r$, $\eta = \omega - i\omega'/\sqrt r$.


The three points $\omega$, $\zeta$, $\eta$ determine a circle $\Gamma$, except when $\zeta$ and $\eta$ coincide.
The harmonic conjugate of $\omega$ with respect to $\zeta$ and $\eta$ is the point $\omega'$. The
points $\omega$, $\omega'$ form a suitable basis for a local coordinate system since homogeneous
coordinates of any point in the plane are expressible in the form
$\omega + u\omega'$ where u is a complex number. If $\zeta$ and $\eta$ coincide, they coincide
in the point $\omega'$. It may be easily shown that the circle $\Gamma$ determined by $\omega$, $\zeta$,
$\eta$ makes with C at $\omega$ an angle equal to the angle of the complex number $i/\sqrt r$
(the proof will be omitted). This angle is $\pi/2$ if r is real and positive;
it is zero if r is real and negative. The circle $\Gamma$ is, therefore, tangent to C
at $\omega$ if at $\omega$ the function r(t) is real and negative.

The homographic element of arc length of a curve C from a point $\omega(t)$
to a neighboring point $\omega_1$, given by $\omega_1 = \omega(t + dt)$, will be defined by the
invariant differential form


(4.1) $d\sigma = 2\sqrt{|r|}\,dt$.


Geometric interpretations will now be given for the forms $d\sigma$ and $d\sigma^2$. The
element $d\sigma$ is given by the relations


(4.2) $d\sigma = 2\,|P(\omega\omega'\omega_1\zeta)| = |P(\log(\zeta\eta\omega\omega_1))|$,


in which the principal value of the logarithm is used and P is used to denote
the principal parts of the cross-ratio $(\omega\omega'\omega_1\zeta)$ and the logarithm of the cross-ratio
$(\zeta\eta\omega\omega_1)$. These results may be verified by making use of the local non-homogeneous
coordinates 0, $\infty$, $dt$, $i/\sqrt r$, and $-i/\sqrt r$ of the points $\omega$, $\omega'$, $\omega_1$,
¢ and », respectively. Let Tf denote the circle determined by the points «, 
¢, » and if this circle is distinct from the circle T it intersects T in a point 
which will be denoted by wz. The general homogeneous coordinates of wo, 
are found to be given by the relation w, = w—vo’/|r|dt. If, however, the 
circles F and I, defined at w, coincide, then r < 0 (real) and the expression 
for w. defines the point w+ ’/rdt which is the harmonic conjugate of the 
point =o’ (t+ dt) with respect to the points w’. In either case the 
point w., may be used to complete geometric interpretations of do*. These 
interpretations are given by the equations 


(4. 3) 4P (log (ww,0'ws ) ) 4P = do’. 


Let the independent parameter t represent time. The motion of the
point $\omega(t)$ along an r = constant curve will be called loxodromic, hyperbolic,
elliptic, or parabolic, according as r(t) is a complex imaginary, a negative
real, a positive real, or zero, respectively. A normal non-homogeneous coordinate
u, called the abscissa, of a generic point $\omega(t)$ of an r = const. curve
will be determined for each type of motion. An angle $\theta$ will be determined
in association with each type of motion in terms of which the abscissa of $\omega$
and, consequently, the homographic arc length of the curve will be simply
expressed.


Case 1. $r = (\mu + i\nu)^2$; $\mu$, $\nu$ = const., $\nu \neq 0$. Normal coordinates of $\omega$
(independent solutions of (2.2)) are


$\omega^1 = \exp(-i\sqrt r\,t)$, $\omega^2 = \exp(i\sqrt r\,t)$,


where $\sqrt r = \mu + i\nu$. The abscissa of $\omega$ is defined by

(4.4) $u = \omega^2/\omega^1 = \exp\{2(-\nu + i\mu)t\}$.


Let $\mu t = \theta$, so that


(4.5) $\log u = 2a\theta + 2i\theta$

where $a = -\nu/\mu$. Hence

(4.6) $d\sigma = 2\sqrt{\mu^2 + \nu^2}\,dt = 2\sec\varphi\,d\theta$


where $\varphi = \arctan \nu/\mu$. Homographic arc length of C between points where
$\theta = \theta_0$ and $\theta = \theta_1$ is therefore given by


(4.7) $\sigma = 2(\theta_1 - \theta_0)\sec\varphi$.

Case 2. $r = -k^2$. Normal coordinates of $\omega$ are $\omega^1 = \exp(-kt)$,
$\omega^2 = \exp(kt)$. The abscissa of $\omega$ is defined by $u = \exp(2kt)$. Let $u = \tan\theta$,
so that $2kt = \log\tan\theta$ and


(4.8) $d\sigma = d(\log\tan\theta) = 2\csc 2\theta\,d\theta$.

Homographic arc length of C between points where $\theta = \theta_0$ and $\theta = \theta_1$ is
therefore given by

(4.9) $\sigma = \log(\tan\theta_1/\tan\theta_0)$.


In case the curve C is a “non-euclidean” straight line in the half-plane of
Poincaré, the formula (4.9) defines the Poincaré metric of C (see [1, pp.
37-39] and [8, pp. 7-8]).


Case 3. $r = k^2$. Normal coordinates of $\omega$ are $\omega^1 = \cos kt$, $\omega^2 = \sin kt$.
The abscissa of $\omega$ is defined by $u = \tan kt$. Let $u = \tan\theta$, so that $kt = \theta$ and


(4.10) $d\sigma = 2\,d\theta$.


The homographic arc length of C between the points where $\theta = \theta_0$ and $\theta = \theta_1$
is therefore


(4.11) $\sigma = 2(\theta_1 - \theta_0)$.

Case 4. $r = 0$. Normal coordinates of $\omega$ are $\omega^1 = 1$, $\omega^2 = t$. The local
non-homogeneous coordinate of $\omega$ is defined by $u = t$. Here $d\sigma = 0$, but a
geometric interpretation may be given for a parameter $\theta$ such that


(4.12) $u = \tan\theta$.
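
The closed forms (4.7), (4.9) and (4.11) may be compared with the defining form (4.1) integrated over an interval of t. The sketch below is illustrative: the numerical values of $\mu$, $\nu$, k and of the interval are assumptions, as is the helper name `arc_length`, and $\mu$ and k are taken positive.

```python
import math


def arc_length(r, t0, t1):
    """Integrate d(sigma) = 2*sqrt(|r|) dt for constant r, as in (4.1)."""
    return 2 * math.sqrt(abs(r)) * (t1 - t0)


t0, t1 = 0.2, 0.9  # hypothetical parameter interval

# Case 1: r = (mu + i*nu)^2, theta = mu*t, phi = arctan(nu/mu).
mu, nu = 1.5, 0.4
phi = math.atan(nu / mu)
case1 = 2 * (mu * t1 - mu * t0) / math.cos(phi)            # (4.7)

# Case 2: r = -k^2, tan(theta) = exp(2*k*t).
k = 0.7
theta0 = math.atan(math.exp(2 * k * t0))
theta1 = math.atan(math.exp(2 * k * t1))
case2 = math.log(math.tan(theta1) / math.tan(theta0))      # (4.9)

# Case 3: r = k^2, theta = k*t.
case3 = 2 * (k * t1 - k * t0)                              # (4.11)

err1 = abs(arc_length((mu + 1j * nu) ** 2, t0, t1) - case1)
err2 = abs(arc_length(-k**2, t0, t1) - case2)
err3 = abs(arc_length(k**2, t0, t1) - case3)
print(err1, err2, err3)  # all close to zero up to rounding
```

In Case 4 the same routine returns zero for every interval, in agreement with $d\sigma = 0$.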




The geometric meanings of the parameters $\theta$ satisfying the above relations
in the various cases will now be given. Consider a pencil of circles passing
through the fixed points $\zeta$, $\eta$ of a loxodromic mapping corresponding to Case 1.
Let any selected one of these circles $C_0$ serve as a basis; the angle at $\zeta$
between the circle $C_0$ and the circle of the pencil which passes through $\omega$
is a parameter $\theta$ which satisfies equation (4.5). To prove this assertion,
note that the equation (4.5) is the polar equation of a logarithmic spiral
one of whose fixed points is infinite and the other of which is the pole of the
coordinate system, the angle $\theta$ being the polar angle of the point $\omega$. The
interpretation of an angle $\theta$ for the case of an r = const. curve (Case 1)
with finite fixed points $\zeta$, $\eta$ is obtained immediately on subjecting the
logarithmic spiral and the $\theta$ = const. lines of the polar coordinate system to
a suitable homography which transforms the pole and infinite point into the
points $\zeta$, $\eta$ and transforms the initial line into the circle $C_0$.

To obtain a geometric interpretation of a parameter $\theta$ in terms of which
$d\sigma$ is given by (4.8), construct the circle $C_0$ through the points $\zeta$, $\eta$ orthogonal
to C. Select on $C_0$ an arbitrary point J distinct from $\zeta$ and $\eta$. Construct the
circle $C_1$ determined by the points J, $\zeta$, $\omega$. The angle between $C_0$ and $C_1$
at $\zeta$ is a parameter $\theta$ in terms of which $d\sigma$ is given by (4.8). To prove this
let $\theta$ denote this parameter and calculate the non-euclidean element $d\sigma$
in terms of $\theta$. Transform the plane by a homography which transforms $\zeta$, $\eta$
into given points $\bar\zeta$, $\bar\eta$ and the point J into the point $\bar J$ at infinity. The transform
$\bar C$ of C is, therefore, a circle orthogonal to the straight line transform
$\bar C_0$ of $C_0$, and the transform $\bar C_1$ of $C_1$ is a straight line (passing through the
points $\bar J$, $\bar\zeta$, $\bar\omega$) which makes with $\bar C_0$ the angle $\theta$. Let $\Gamma_1$ denote the circle
determined by J, $\zeta$, $\omega_1$. The transform $\bar\Gamma_1$ of this circle is a straight line
through $\bar\zeta$ making with $\bar C_0$ the angle $\theta + d\theta$. Since the element $d\sigma$ is an
invariant of homographies, from equations (4.2) it follows that


$d\sigma = P(\log(\bar\zeta\bar\eta\bar\omega\bar\omega_1)) = \log\dfrac{(\bar\zeta\bar\omega)(\bar\eta\bar\omega_1)}{(\bar\eta\bar\omega)(\bar\zeta\bar\omega_1)} = d(\log\tan\theta)$.


This completes the proof.


A parameter $\theta$ in terms of which $d\sigma$, for Case 3, is defined by (4.10)
may be geometrically characterized as follows. A circle C at whose points
$r = k^2$ is orthogonal to all of the circles of a pencil which pass through the
fixed points $\zeta$ and $\eta$. Let $C_0$ be any selected circle of the pencil and let $\Gamma$
denote the circle of the pencil which passes through the point $\omega$. The angle
which the circle $\Gamma$ makes with the circle $C_0$ is equal to $2\theta$, where $\theta$ is a
parameter of the type sought. For the proof transform by a homography
the point $\eta$ into the point $\bar\eta$ at infinity; the pencil of circles transforms into
a pencil of lines with center at $\bar\zeta$ orthogonal to the transform $\bar C$ of C. It
follows that the homographic element $d\sigma$ is given by


$d\sigma = |P(\log(\bar\zeta\bar\eta\bar\omega\bar\omega_1))| = |P(\log((\bar\omega - \bar\zeta)/(\bar\omega_1 - \bar\zeta)))| = 2\,d\theta$.


A parameter $\theta$ in terms of which the abscissa of $\omega$ is given by the relation
(4.12) may be characterized as follows: Select on the circle C an arbitrary
point J and construct a circle $C_0$ which passes through $\zeta$ and J and is
orthogonal to C. Select on $C_0$ any other point R and through the point $\omega$
of C draw the circle $\Gamma$ determined by the points R, J, and $\omega$. The angle
formed at R by the two circles $C_0$ and $\Gamma$ is a parameter $\theta$ in terms of which u
is given by the relation (4.12). The proof, which is similar to those of the
former cases, will be omitted.

A generic point $\omega$ of an r = constant curve is given by


(4.13) $\omega = \zeta + \eta\exp(2i\sqrt r\,t)$


in which $\zeta$, $\eta$ denote the double points of a mapping of the curve upon itself.
The proofs of the following well known properties, by the methods of the
present paper, may be readily supplied by the reader on making use of the
fact that a homographic mapping is conformal. A homography other than
the identity which maps a curve upon itself maps the curves of a one parameter
family upon themselves. If the mapping is (1) loxodromic, (2) hyperbolic,
(3) elliptic, or (4) parabolic, the family consists of (1) non-degenerate
loxodromes, (2) circles which pass through the double points of the mapping,
(3) the orthogonal trajectories of the circles which pass through the double
points of the mapping, or (4) the circles tangent to a given circle at the
double point of a parabolic mapping, respectively.


5. The geometry of the absolute. In keeping with the geometric
characterization given by (4.2) for the element of homographic arc length
of a curve, let the homographic distance between two points P, Q with respect
to an absolute $(\zeta, \eta)$ be defined by


(5.1) $PQ = |\log(\zeta\eta PQ)|$


in which the principal value of the logarithm is used. In the geometry of
the absolute $(\zeta, \eta)$ in which the metric is defined by (5.1) a “straight” line
will be defined to be a curve whose homographic arc length between any two
of its points P, Q is equal to the homographic distance between these two
points. To avoid confusing these lines with ordinary straight lines they will
be called s-lines. To find the s-lines of the plane let a generic point $\omega$ of an
s-line be given by


(5.2) $\omega = \zeta + \eta e^{w}$, $w = \tau + i\theta$,


where $\tau$ is a function of $\theta$ to be determined. Let $\tau(\theta_0)$ be denoted by $\tau_0$;
the homographic distance between two points $\omega_0$, $\omega$ now assumes the form


(5.3) $|\log(\zeta\eta\omega_0\omega)| = [(\tau - \tau_0)^2 + (\theta - \theta_0)^2]^{1/2}$.


Hence, the condition that the locus of $\omega$ be an s-line is that

(5.4) $\int_{\theta_0}^{\theta} [1 + (d\tau/d\theta)^2]^{1/2}\,d\theta = [(\tau - \tau_0)^2 + (\theta - \theta_0)^2]^{1/2}$


for any two points $\omega_0$, $\omega$ of the locus. If $\theta \neq$ const., differentiating equation
(5.4) with respect to $\theta$, squaring both sides of the resulting equation, and
clearing of fractions yields the equation $(\theta - \theta_0)\,d\tau = (\tau - \tau_0)\,d\theta$ whose
general solution is of the form


(5.5) $\tau = a\theta + b$, a, b = const.

Let $\rho$ denote $|e^{w}|$, then $\rho$ is given by


(5.6) $\rho = ce^{a\theta}$, a, c = const.


Since equation (5.2) with $\rho$ given by (5.6) defines a two parameter family
of curves, it is clear that only one s-line passes through any two given points
which are distinct from the points of the absolute, and only one s-line passes
through a given point (distinct from $\zeta$ or $\eta$) in a given direction. The
s-lines thus determined by (5.2) and (5.6) are in general loxodromes, and
along an s-line where $\theta \neq$ a constant these loxodromes are non-degenerate.
The s-lines along which $a = 0$ are the orthogonal trajectories of the circles
$\theta$ = constant which pass through the points $\zeta$ and $\eta$.

Along an arc of a $\theta$ = const. circle the condition (5.4) assumes the form


$\tau - \tau_0 = \int_{\tau_0}^{\tau} d\tau$


which is satisfied identically in $\tau$. Since the values of $\tau$ which correspond
to the points $\zeta$ and $\eta$ of the absolute are $-\infty$ and $+\infty$ respectively, an arc
of a circle contained between $\zeta$ and $\eta$ is an entire s-line. The following
general result may now be stated: An s-line is a curve which makes a constant
angle with the circles which pass through the points of the absolute. If the
angle is not a right angle the s-line is a loxodrome whose asymptotic points
are $\zeta$ and $\eta$ or an arc of a circle whose endpoints are $\zeta$ and $\eta$. An r = const.
curve is an s-line with respect to the absolute formed by the fixed points $\zeta$, $\eta$
of a mapping of the curve upon itself.

In the geometry of the metric (5.1) the points $\zeta$, $\eta$ of the absolute are
the points “at infinity,” that is, the homographic distance from either $\zeta$
or $\eta$ to any other point in the plane is infinite, whereas the homographic
distance between any two points distinct from $\zeta$ and $\eta$ is finite. The numbers
$\theta$, $\tau$ will be called normal curvilinear coordinates of the point $\omega$ with respect
to the absolute $\zeta$, $\eta$. It is clear that a set of coordinates $\theta$, $\tau$ determines by
(5.2) one and only one point $\omega$. The transformation (5.2) maps the lines
$\tau = -\infty$ and $\tau = +\infty$ into the points $\zeta$ and $\eta$, respectively, but otherwise
maps the entire finite w-plane conformally upon the points of the z-plane.
A straight line of the w-plane corresponds to an s-line of the z-plane, and
(5.3) shows that the homographic distance between any two points $\omega_0$, $\omega$ of
the z-plane is equal to the ordinary distance between the corresponding points
in the w-plane. The geometry of the z-plane (except at the points $\zeta$ and $\eta$
“at infinity”) is therefore euclidean. This, of course, implies that the sum
of the interior angles of any triangle formed by s-lines is equal to $\pi$ radians
(a result which can also be easily obtained by making use of the fact that
an s-line cuts at a constant angle the circles which pass through the points
$\zeta$ and $\eta$). The curious case of an s-line triangle with one vertex $\omega_1$ superposed
upon another $\omega_0$ occurs if $\theta_1$ differs from $\theta_0$ by an integral multiple of
$2\pi$ radians; in this case the side $\omega_0\omega_1$ is a complete circumference of a circle.

Since the arcs ξη of circles and the family of orthogonal trajectories are
the families of θ = const. curves and τ = const. curves, respectively, these
families are families of equidistants, in the geometry of the absolute, in which
distances between members of one family are measured along the members of
the other family. The members of each of these families will be called
parallels with respect to the absolute. Moreover, in general, distinct curves
which make equal constant angles with all of the θ = constant curves will
be called parallel, since such curves cannot intersect in any point (p, θ)
distinct from the points of the absolute.

A congruent transformation in the geometry of this metric is a trans- 
formation which preserves the points of the absolute; two figures which 
correspond in a congruent transformation are called congruent figures. The 
mapping of figures in the z-plane which are congruent with respect to homo- 
graphies by the transformation (5.2) produces figures in the w-plane which 
are congruent with respect to the ordinary rigid displacements. It is, therefore,
logical to call the homographies which preserve the points of the absolute
(the points ξ, η “at infinity”) displacements with respect to the homographic
metric.
UNIVERSITY OF KANSAS. 


BIBLIOGRAPHY. 


[1] E. Cartan, Leçons sur la théorie des espaces à connexion projective, Paris,
Gauthier-Villars (1937).

[2] W. F. Osgood, Lehrbuch der Funktionentheorie, Leipzig, B. G. Teubner (1912).

[3] H. Liebmann, “Beiträge zur Inversionsgeometrie der Kurven,” Sitzungsberichte der
Bayerischen Akademie der Wissenschaften zu München, Heft 1 (1923).

[4] T. Kubota, “Beiträge zur Inversionsgeometrie,” Tohoku Imperial University,
Science Reports, vol. 13 (1924-25), pp. 243-251.

[5] Jusaku Maeda, “Geometrical meanings of the inversion curvature of a plane curve,”
Japanese Journal of Mathematics, vol. 16 (1940), pp. 177-232.

[6] Frank Morley, “On differential inversive geometry,” American Journal of
Mathematics, vol. 48 (1926), pp. 144-146.

[7] B. C. Patterson, “The differential invariants of inversive geometry,” American
Journal of Mathematics, vol. 50 (1928), pp. 553-568.

[8] H. Poincaré, “Théorie des groupes fuchsiens,” Acta Mathematica, vol. 1 (1882),
pp. 1-62.

[9] P. O. Bell, “A characterization of the group of homographic transformations,”
Bulletin of the American Mathematical Society, vol. 47 (1941), pp. 488-493.




DOUBLE VECTOR SPACES OVER DIVISION RINGS.* 


By G. HOCHSCHILD.


Introduction. Let M be a (left) vector space over the division ring D.
Let E be an arbitrary ring with an identity element, and suppose there is
given an antihomomorphism of E into the ring of D-linear transformations
in M, such that the identity element of E is mapped into the identity
transformation. If we denote by E* the ring of right multiplications
x → e*{x} = xe in E, then the mapping e → e* is an anti-isomorphism of E onto E*,
and the situation described above amounts to having a representation of E*
as D-linear transformations in M.

Under these circumstances, it is convenient to think of M as having
simultaneously the structure of a left D-space and that of a right E-space,
and to indicate the corresponding scalar operations as m → d·m, for d ∈ D,
and as m → m·e, for e ∈ E. Our above requirements may then all be absorbed
in the statement that the “· products” should behave like ordinary products.
We shall say then that M has the structure of a (D, E)-space.

The theory of (D, E)-spaces may, of course, be subsumed under the
theory of matrices with coefficients in D. A study of matrices with coefficients
in a division ring which is relevant here was made by R. Brauer in 1941,¹
but this overlaps only slightly with what we propose to do.

In the special case where D and E both coincide with a commutative
field F, the study of (F, F)-spaces amounts to the same as Jacobson’s theory
of self-representations.² Jacobson has shown that the self-representations
provide a generalization of Galois theory to general field extensions, not
necessarily normal, or even separable. Part of our program is to consider this
aspect in the non-commutative case.

The Galois theory for division rings has recently been developed by N.
Jacobson ³ and, independently, by H. Cartan.⁴ Their chief tool is a theorem,
due to Jacobson and to N. Bourbaki, on rings of additive endomorphisms of
a division ring. We shall state this theorem in a slightly generalized form
in § 2, which also contains a completely elementary proof. In § 3, we give a
brief indication of how this theorem is used in order to establish the main
results of the Galois theory.

* Received January 30, 1948; revised July 20, 1948.

¹ R. Brauer, “On sets of matrices with coefficients in a division ring,” Transactions
of the American Mathematical Society, vol. 49 (1941), pp. 502-548.

² N. Jacobson, “An extension of Galois theory to non-normal and non-separable
fields,” American Journal of Mathematics, vol. 66 (1944), pp. 1-29.

³ N. Jacobson, “A note on division rings,” ibid., vol. 69 (1947), pp. 27-36.

⁴ H. Cartan, “Théorie de Galois pour les corps non commutatifs,” Annales
Scientifiques de l’École Normale Supérieure, vol. 65 (1948), pp. 60-77.

An important feature of this approach is that the Galois correspondence 
between division subrings of a division ring D and groups of automorphisms 
of D is derived from the correspondence between division subrings K of D 
and the rings of all K-linear transformations in D. This last correspondence 
may be regarded as a generalization of the usual Galois correspondence. The 
role played by a single automorphism is then taken over by an indecomposable 
space of additive endomorphisms of D. 

As Jacobson has shown, the full operator ring, or the “ composite,” of a 
self-representation of a field F is determined to within an equivalence by a 
certain space of additive endomorphisms of F, its so-called relations space, 
and it is precisely this fact which ties up the theory of self-representations 
with the Galois theory. In the non-commutative case, there arises the 
difficulty that the composite of a representation may easily fail to have the 
necessary finiteness properties, and cannot be made a part of an adequate 
theory. There is, however, another aspect of the composite which is capable 
of a satisfactory generalization. 

Abstractly, a composite of two rings D and E with identity elements is a
ring R which is generated by homomorphic images D′ and E′ of D and E,
respectively, in which every element of D′ commutes with every element of
E′, and where the identity elements of D and E are both mapped into the
identity element of R. Clearly, R carries the structure of a (D, E)-space
in the natural way, and R is generated as a (D, E)-space from its identity
element. This has led us to introduce in § 5 the notion of a relative cyclic
(D, E)-space, i.e., a (D, E)-space M in which we have singled out an
element m such that D·m·E = M. To such a space we attach a certain
space of additive homomorphisms of E into D, its relations space, which
coincides with Jacobson’s relations space in the case where M is the
composite of a field F with itself.


In §§ 5 and 6, we establish the main facts concerning the correspondence
between relative cyclic spaces and their relations spaces. In § 7, we apply
these facts to the particularly interesting case where E is a division ring and
D a division subring of E. In this case, the product of two relative cyclic
(D, E)-spaces is always defined and leads to a notion of closure for such spaces.
A space is closed if and only if its relations space is a ring. Furthermore,
the closed spaces are exactly the Kronecker product spaces D × E, taken
with respect to some division subring of D.

There is another type of double vector space which is useful in studying
extensions of a division ring K. This is the K-regular space defined in § 4.
The purpose of the restriction to K-regularity is to make the transformations
corresponding to the elements of K as simple as possible without losing too
much generality. For the case where D is a finite normal extension of K,
and E is a division ring between K and D, we obtain in § 4 a complete theory
of K-regular (D, E)-spaces which reflects the main facts of the Galois theory
for D over K. In this case, the situation is so simple that we have no need
of relations spaces.

I wish to express my thanks to Prof. R. Brauer whose critical comments
on an earlier version of this paper have been very valuable to me in the
revision and to whom I am indebted for a number of improvements, notably
in §§ 2 and 4.


1. Kronecker products. Let D be a division ring. By a vector space
over D we shall mean an additive group A, together with an isomorphism φ
of D into the ring of endomorphisms of A, such that, if 1 is the identity
element of D, φ{1} is the identity mapping in A. If d ∈ D and a ∈ A we
shall write d·a for (φ{d}){a}. The · operation has then the formal properties
of a multiplication.

What we have just described is frequently called a left vector space
over D. The notion of a right vector space is obtained by replacing the
above isomorphism φ by an anti-isomorphism. In order to avoid confusion,
we shall use the phrase “vector space” only to denote a left vector space.
Similarly, by the dimension of a space, we shall always mean its dimension
as a left vector space. We shall absorb the notion of a right vector space in
our terminology by means of the following device: Let D* denote the ring
of right multiplications in D. We make use of the anti-isomorphism
d → d* of D onto D* which we have defined at the beginning of the
introduction in order to regard a right vector space B over D as a left vector
space over D*. We shall frequently write b·d for d*·b, i.e., we shall
adhere to the usual formalism for right vector spaces.

From a vector space A over D and a vector space B over D* we can
construct a certain additive group, called the Kronecker product B × A. We
need not give such a construction here, since quite a variety of suitable ones
are well known. We shall merely enumerate certain fundamental properties
which characterize the Kronecker product.


(1) To every pair (b, a), where b ∈ B and a ∈ A, there corresponds an
element b × a of B × A.

(2) The elements b × a generate B × A, i.e., every element of B × A
is the sum of a finite number of such elements.

(3) (b_1 + b_2) × a = b_1 × a + b_2 × a.

(4) b × (a_1 + a_2) = b × a_1 + b × a_2.

(5) b·d × a = b × d·a, for d ∈ D.

(6) If (a_γ) (γ ∈ some finite set Γ) is a set of D-linearly independent
elements of A, and if b(γ) are elements of B, not all equal to 0, then

\[ \sum_{\gamma \in \Gamma} b(\gamma) \times a_{\gamma} \neq 0. \]

(7) If (b_δ) is a finite set of D*-linearly independent elements of B,
and if a(δ) are elements of A, not all 0, then Σ_δ b_δ × a(δ) ≠ 0.


Let S be a D-linear transformation in A. Then S induces in B × A
an endomorphism S′ such that S′{b × a} = b × S{a}. Similarly, if T
is a D*-linear transformation in B then T induces in B × A an
endomorphism T′ such that T′{b × a} = T{b} × a. We shall apply these facts to
the following situation: Let B be an (E, D)-space and A a (D, F)-space,
where E and D are division rings. Then we may construct the Kronecker
product B × A with respect to D as above, and we see that now B × A carries
the structure of an (E, F)-space such that, for e ∈ E and f ∈ F, we have
e·(b × a) = e·b × a, and (b × a)·f = b × a·f.


2. Additive homomorphisms of division rings. Let S be an arbitrary
set, D a division ring. The additive group of all mappings of S into D
may be regarded as a vector space over D*, the ring of right multiplications
in D, where the scalar multiple of a mapping T by an element d* ∈ D* is
defined as the ordinary product d*T (i.e., T followed by d*). We shall
require the following elementary lemma:

Lemma 2.1. Let T_1, ..., T_n be a set of mappings of the set S into
the division ring D which are linearly independent with respect to D*.
Then there exist n linearly independent D*-linear combinations U_1, ..., U_n
of the T_i and elements s_1, ..., s_n in S, such that U_i{s_j} = δ_ij, where
δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j.


Proof. Suppose we have already found s_1, ..., s_k in S and independent
D*-linear combinations T_1^(k), ..., T_n^(k) of the T_i, such that
T_i^(k){s_j} = δ_ij for 1 ≤ i ≤ n and 1 ≤ j ≤ k. Here, we wish to include
also the case k = 0, where we have no elements in S, and where T_i^(0) is to
stand for T_i. Then, if k < n, there is an element s_{k+1} ∈ S such that
T_{k+1}^(k){s_{k+1}} ≠ 0. We set

\[ T_{k+1}^{(k+1)} = \bigl( (T_{k+1}^{(k)}\{s_{k+1}\})^{-1} \bigr)^{*}\, T_{k+1}^{(k)}, \]

so that T_{k+1}^(k+1){s_{k+1}} = 1. For every j ≠ k + 1, we set

\[ T_{j}^{(k+1)} = T_{j}^{(k)} - \bigl( T_{j}^{(k)}\{s_{k+1}\} \bigr)^{*}\, T_{k+1}^{(k+1)}. \]

Then we have T_i^(k+1){s_j} = δ_ij for 1 ≤ i ≤ n and 1 ≤ j ≤ k + 1, and the
T_i^(k+1) are still linearly independent. Proceeding in this fashion, starting
from the case k = 0, we finally obtain s_1, ..., s_n in S and U_i = T_i^(n) which
satisfy the requirements of the lemma.
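
The inductive construction in this proof is an elimination procedure and can be traced on a small example. The following is a minimal sketch only, under the simplifying assumption that D is the commutative field of rational numbers (so that the right multiplication d* is ordinary scaling) and that S is a finite set indexed by 0, 1, 2, .... The name dual_system and the encoding of each mapping T_i as a row of values are illustrative choices, not notation from the paper.

```python
from fractions import Fraction


def dual_system(T):
    """Return (U, points) with U[i][points[j]] == 1 if i == j else 0.

    T[i][s] is the value of the mapping T_i at the point s of S.
    Assumption: values lie in the rationals, a commutative stand-in for D.
    """
    rows = [[Fraction(v) for v in row] for row in T]
    n = len(rows)
    points = []
    for k in range(n):
        # Choose a point where the k-th current combination does not vanish.
        s = next((p for p, v in enumerate(rows[k]) if v != 0), None)
        if s is None:
            raise ValueError("the mappings are not linearly independent")
        pivot = rows[k][s]
        # Normalise so that the k-th combination takes the value 1 at s.
        rows[k] = [v / pivot for v in rows[k]]
        # Make every other combination vanish at s.
        for j in range(n):
            if j != k:
                c = rows[j][s]
                rows[j] = [vj - c * vk for vj, vk in zip(rows[j], rows[k])]
        points.append(s)
    return rows, points


U, pts = dual_system([[1, 2, 3], [0, 1, 4], [5, 6, 0]])
print(pts, [[U[i][p] for p in pts] for i in range(len(U))])
```

Each pass of the loop corresponds to one step from k to k + 1 of the induction: a new point is chosen, one combination is normalised there, and the others are corrected so that they vanish at it.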

Now let D be a division subring of the division ring E. The additive
homomorphisms of E into D constitute a ring which is also a (D*, E*)-space,
all operations being defined in the natural fashion as ordinary compositions
of mappings. The most important subrings of this ring are those which
are also (D*, E*)-subspaces:


Definition 2.1. Let A be a subring of the ring of all additive
homomorphisms of the division ring E into the division subring D of E. A is
called complete if A ≠ (0), and if D*AE* = A.

If A is complete then the only element of E which is mapped into 0
by all the elements of A is 0. In fact, if we had 0 ≠ e_0 ∈ E such that
U{e_0} = 0 for all U ∈ A then, for every V ∈ A and every e ∈ E, we should
have V{e} = (V(e_0^{-1}e)*){e_0} = 0, contradicting our hypothesis that A ≠ (0).

The fundamental result concerning complete rings is the following: 


THEOREM 2.1 (Jacobson-Bourbaki). Let A be a complete subring of
the ring of additive homomorphisms of E into D ⊂ E. Denote by K the
division subring of D which consists of all elements d ∈ D for which
U{de} = dU{e}, for all e ∈ E and all U ∈ A (i.e., whose left multiplications
e → de commute with all the elements of A). Then if either E is of finite
dimension [E : K] over K, or if A is of finite dimension [A : D*] over D*,
we have [E : K] = [A : D*], and A coincides with the ring of all K-linear
mappings of E into D.


Proof. If there are n D*-linearly independent elements of A we can
apply Lemma 2.1 to find elements e_1, ..., e_n in E and elements U_1, ..., U_n
in A such that U_i{e_j} = δ_ij. Then the e_i are evidently linearly independent
over K. Hence if E is finite over K then A must be finite over D*. Thus,
we may suppose that [A : D*] = n, and that the above U_i constitute a basis
for A over D*.


Then, if U is an arbitrary element of A, we must clearly have

\[ U = \sum_{i=1}^{n} U\{e_i\}^{*}\, U_i. \]

Let V ∈ A, e ∈ E, and set U = Ve*U_j. Then U ∈ A, and the last relation
gives

\[ V e^{*} U_j = \sum_{i=1}^{n} \bigl( V\{U_j\{e_i\}e\} \bigr)^{*}\, U_i = V\{e\}^{*}\, U_j. \]

Hence, for every e_0 ∈ E, V{U_j{e_0}e} = U_j{e_0}V{e}, which means that
U_j{E} ⊂ K.

Now, for e ∈ E, set

\[ e' = e - \sum_{k=1}^{n} U_k\{e\}\, e_k. \]

Then U_i{e′} = 0, for all i, whence U{e′} = 0, for all U ∈ A. Hence e′ = 0,
i.e.,

\[ e = \sum_{k=1}^{n} U_k\{e\}\, e_k. \]

Thus, e_1, ..., e_n is a basis for E over K, so that [E : K] = n = [A : D*].
Finally, if T is any K-linear mapping of E into D we have clearly

\[ T\{e_i\} = \Bigl( \sum_{k=1}^{n} (T\{e_k\})^{*}\, U_k \Bigr)\{e_i\}, \]

whence

\[ T = \sum_{k=1}^{n} (T\{e_k\})^{*}\, U_k \in A. \]

Since every element of A is evidently a K-linear mapping, our proof is
complete.
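
As an illustration of Theorem 2.1, which is not part of the original text, one may take E = D to be the field of complex numbers and A the ring generated by the right multiplications and complex conjugation c. The choice of this example and the symbols c, α, β are assumptions made for the sketch; here the elements U_1, U_2 of the proof turn out to be the real and imaginary parts.

```latex
% Assumed example: E = D = \mathbb{C}, c = complex conjugation, A = D^{*} + D^{*}c.
\begin{align*}
A &= \{\, x \mapsto x\alpha + \bar{x}\beta : \alpha, \beta \in \mathbb{C} \,\},
   \qquad [A : D^{*}] = 2,\\
K &= \{\, d \in \mathbb{C} : \overline{de} = d\,\bar{e}\ \text{for all } e \in \mathbb{C} \,\}
   = \mathbb{R}, \qquad [E : K] = 2,\\
e_1 &= 1,\quad e_2 = i,\qquad
U_1\{x\} = \tfrac{1}{2}(x + \bar{x}) = \operatorname{Re} x,\qquad
U_2\{x\} = \tfrac{1}{2i}(x - \bar{x}) = \operatorname{Im} x,\\
U_i\{e_j\} &= \delta_{ij},\qquad U_j\{E\} \subset K,\qquad
e = U_1\{e\}\,e_1 + U_2\{e\}\,e_2 .
\end{align*}
```

In this case A is the ring of all real-linear mappings of the complex numbers into themselves, in agreement with the last assertion of the theorem.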


3. Galois theory. Just as in the case of fields, the following property 
of “normality ” of an extension is fundamental in the Galois theory: 


Definition 3.1. Let D be a division ring containing the division ring 
K. We say that D is normal over K if every element of D which is left 
fixed by every automorphism of D over K belongs to K. 

The results of the Galois theory are obtained by applying Theorem 2.1
to the rings of endomorphisms of the additive group of D which are generated
by the right multiplications and the elements of a group of automorphisms
of D. An important fact which we shall make use of later is the following:

Let D be normal and of finite dimension n over K. Then the ring of
all K-linear transformations in D has a basis over D* which consists of n
automorphisms of D over K.

The ring of all additive endomorphisms of D has the structure of a 
(D*, D*)-space in a natural manner. If g is an automorphism of D then 
gd* = g{d}*g, whence we see that D*g is a (D*, D*)-subspace of the space 
of all additive endomorphisms. 
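
The relation gd* = g{d}*g can be checked on a concrete non-commutative example. The sketch below is an illustration under stated assumptions: D is taken to be the rational quaternions, g is an inner automorphism x → uxu^{-1}, and the helper names qmul, qinv, inner, right_mult and relation_holds are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficient tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def qinv(q):
    """Inverse of a nonzero quaternion: the conjugate divided by the norm."""
    a, b, c, d = (Fraction(x) for x in q)
    n = a*a + b*b + c*c + d*d
    return (a / n, -b / n, -c / n, -d / n)


def inner(u):
    """The inner automorphism g: x -> u x u^{-1}."""
    u_inv = qinv(u)
    return lambda x: qmul(qmul(u, x), u_inv)


def right_mult(d):
    """The right multiplication d*: x -> x d."""
    return lambda x: qmul(x, d)


def relation_holds(g, d, x):
    """Compare (g d*){x} with (g{d}* g){x}; in a product the right factor acts first."""
    lhs = g(right_mult(d)(x))
    rhs = right_mult(g(d))(g(x))
    return lhs == rhs


g = inner((1, 1, 0, 0))
print(relation_holds(g, (2, 0, 3, 1), (0, 1, -1, 4)))
```

Exact rational arithmetic is used so that the comparison is an equality of quaternions rather than a floating-point approximation.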




Conversely, if M is any (D*, D*)-space of additive endomorphisms of D
which is of dimension 1 over D* then M contains an element g which is a
ring automorphism of D, and D*g = M. In fact, g is the unique element
of M which maps 1 into 1.

If D is finite and normal over K it can be shown that every
indecomposable (D*, D*)-space of K-linear transformations in D is of dimension 1
over D* and hence is generated by a unique automorphism of D over K.⁵

Noting further that (D*g_1)(D*g_2) = D*g_1g_2, we see that if we regard
the (D*, D*)-spaces of additive endomorphisms of D as generalizations of
sets of automorphisms of D the ordinary multiplication, in the ring of all
additive endomorphisms, of two such subspaces is the analogue of the
multiplication of sets of automorphisms. An indecomposable (D*, D*)-space of
additive endomorphisms is the proper analogue of a single automorphism.
It must be kept in mind, however, that, while the product of two
automorphisms is again a single automorphism, the product of two indecomposable
spaces of additive endomorphisms need, in general, not be indecomposable.


4. Double vector spaces over normal division rings. 


Definition 4.1. Let D and E be division rings, each containing the
division ring K. A (D, E)-space V is said to be K-regular if there exists
a subset V_0 of V such that D·V_0·E = V and k·v = v·k, for all k ∈ K
and all v ∈ V_0.

Suppose that D ⊃ E ⊃ K, and let g be an isomorphism of E into D
which leaves the elements of K fixed. Then we construct a K-regular (D, E)-
space as follows:


The underlying additive group is a copy, D′ say, of that of D.
We denote by d → d′ an isomorphism of the additive group of D onto D′.
The scalar operations in D′ are defined by setting d_1·d′ = (d_1d)′, and
d′·e = (dg{e})′.

Clearly, the set consisting of the element 1′ alone satisfies the
requirements for V_0 above. Thus we have a K-regular (D, E)-space which we shall
denote V(g).

If g_1 and g_2 are two isomorphisms of E into D which leave the elements
of K fixed then V(g_1) and V(g_2) are (D, E)-operator isomorphic if and
only if there is an inner automorphism a of D such that g_1 = ag_2.

In fact, denote the additive groups of V(g_1) and V(g_2) by D′ and D″,
respectively. If g_1{e} = d_0g_2{e}d_0^{-1}, with some fixed d_0 ∈ D, for all e ∈ E, we
define a mapping φ of D′ onto D″ by setting φ{d′} = (dd_0)″. Then

\[ d_1 \cdot \varphi\{d'\} = (d_1 d d_0)'' = \varphi\{(d_1 d)'\} = \varphi\{d_1 \cdot d'\}, \]

and

\[ \varphi\{d'\} \cdot e = (d d_0 g_2\{e\})'' = (d g_1\{e\} d_0)'' = \varphi\{(d g_1\{e\})'\} = \varphi\{d' \cdot e\}, \]

i.e., φ is an operator homomorphism. Moreover, it is evident that the kernel
of φ is zero and that φ maps V(g_1) onto V(g_2).

Conversely, if φ is an operator isomorphism of V(g_1) onto V(g_2), and
if φ{1′} = d_0″, then we deduce that g_1{e} = d_0g_2{e}d_0^{-1}. In fact, φ{1′}·e
= φ{g_1{e}′}, i.e., d_0″·e = (g_1{e}d_0)″, or
(d_0g_2{e})″ = (g_1{e}d_0)″, whence the result.

⁵ This follows, for instance, from the results in §§ 4, 5 and 6 of this paper.

Now let us suppose that D is finite and normal over K and consider
the Kronecker product D × E with respect to K, where D ⊃ E ⊃ K.

This is a K-regular (D, E)-space. In fact, D × E = D·(1 × 1)·E,
and k·(1 × 1) = k × 1 = 1 × k = (1 × 1)·k, for every k ∈ K.

Let [D : K] = n. As we have remarked in § 3, there exist n
automorphisms g_1, ..., g_n of D over K which form a basis over D* for the
space of all K-linear transformations in D. Since every K-linear mapping
of E into D can evidently be extended to a K-linear transformation in D,
the restrictions ḡ_i of the g_i to E must contain a basis over D* for the space
of all K-linear mappings of E into D. We may suppose that ḡ_1, ..., ḡ_m
is such a basis. Then m = [E : K].

For each index i, 1 ≤ i ≤ m, there exists a homomorphism π_i of the
additive group of D × E onto that of V(ḡ_i), whose elements we shall now
denote by d^(i), which is characterized by the relations

\[ \pi_i\Bigl\{ \sum_{\alpha} d_{\alpha} \times e_{\alpha} \Bigr\} = \sum_{\alpha} \bigl( d_{\alpha}\, \bar{g}_i\{e_{\alpha}\} \bigr)^{(i)}, \]

for all d_α ∈ D and e_α ∈ E. Moreover, π_i is evidently a
(D, E)-operator homomorphism of the (D, E)-space D × E onto the (D, E)-
space V(ḡ_i).

Now let M be the direct sum of the V(ḡ_i), for 1 ≤ i ≤ m, and let π
be the mapping of D × E into M which is defined by setting π{z} = (π_1{z},
..., π_m{z}), for every z ∈ D × E. Evidently, π is a (D, E)-operator
homomorphism. We claim that π is actually an isomorphism.

Let z be in the kernel of π. Then π_i{z} = 0, for each i. If e_1, ..., e_m
is a basis for E over K we can determine elements d_ij ∈ D such that

\[ \sum_{i=1}^{m} \bar{g}_i\{e_k\}\, d_{ij} = \delta_{kj}. \]

The element z may be written in the form

\[ z = \sum_{k=1}^{m} d_k \times e_k. \]

We have then

\[ \sum_{k=1}^{m} d_k\, \bar{g}_i\{e_k\} = 0, \quad \text{for each } i, \qquad \text{whence} \qquad \sum_{k,\,i} d_k\, \bar{g}_i\{e_k\}\, d_{ij} = 0, \quad \text{for each } j; \]

i.e., d_j = 0, and thus z = 0, which proves our assertion.

Finally, we note that the dimension of M over D is equal to m, the
dimension of D × E over D. Hence π maps D × E onto M.
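
The smallest instance of this decomposition may help to fix ideas. The following worked example is an illustration only and is not taken from the paper: it assumes D = E = the complex numbers and K = the real numbers, so that n = m = 2, with g_1 the identity and g_2 complex conjugation.

```latex
% Assumed example: D = E = \mathbb{C}, K = \mathbb{R}, g_1 = identity, g_2 = conjugation.
\begin{align*}
\pi\{z \times w\} &= \bigl( (zw)^{(1)},\ (z\bar{w})^{(2)} \bigr),\\
\pi\{d_1 \times 1 + d_2 \times i\} &= \bigl( (d_1 + d_2 i)^{(1)},\ (d_1 - d_2 i)^{(2)} \bigr),\\
\pi\{1 \times i - i \times 1\} &= \bigl( 0^{(1)},\ (-2i)^{(2)} \bigr),\qquad
\pi\{1 \times i + i \times 1\} = \bigl( (2i)^{(1)},\ 0^{(2)} \bigr).
\end{align*}
```

The second line vanishes only if d_1 = d_2 = 0, which is the kernel argument of the text in this special case, and the third line exhibits elements of D × E lying in each of the two simple summands.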


We are now in a position to prove the following theorem: 


THEOREM 4.1. Let D be finite and normal over K, and let E be a
division ring such that D ⊃ E ⊃ K. Then every K-regular (D, E)-space
is a sum of simple K-regular subspaces. Every simple K-regular (D, E)-
space is (D, E)-operator isomorphic with a V(ḡ), where ḡ is the restriction
to E of an automorphism g of D over K.


Proof. Let N be any K-regular (D, E)-space. N is the sum of
subspaces of the form D·n·E, where n ∈ N, and k·n = n·k, for every k ∈ K.
Hence it suffices to prove our theorem in the case where N = D·n·E.

In that case, we can evidently define a (D, E)-operator homomorphism
ψ of D × E onto N which is such that

\[ \psi\Bigl\{ \sum_{\alpha} d_{\alpha} \times e_{\alpha} \Bigr\} = \sum_{\alpha} d_{\alpha} \cdot n \cdot e_{\alpha}. \]

Hence there is also an operator homomorphism h = ψπ^{-1} of
M = (V(ḡ_1), ..., V(ḡ_m)) onto N. Let W be the kernel of h. By a standard
argument of the theory of semisimple groups with operators, there are indices
i_1, ..., i_s such that M is the direct sum of W and (V(ḡ_{i_1}), ..., V(ḡ_{i_s})).
Then h induces an operator isomorphism of (V(ḡ_{i_1}), ..., V(ḡ_{i_s})) onto N,
which proves the first part of our theorem, since the V(ḡ_i) are evidently
simple. The rest follows by noting that if N is simple it must be of the form
D·n·E, and s must be equal to 1.

Notice that Theorem 4.1 implies that every K-regular (D, E)-space is a
direct sum of simple K-regular (D, E)-subspaces, as a standard argument,
using Zorn’s lemma, will show.


COROLLARY 1. Let N be a K-regular (D, E)-space, where D and E are
as above. Then there exists a K-regular (D, D)-space N′ which, when
regarded as a (D, E)-space, coincides with N.


Proof. By the above remark, it suffices to show this for a simple (D, E)-
space. By the second part of Theorem 4.1, we may assume that the space
is V(ḡ), where ḡ is the restriction to E of an automorphism g of D over K.
But then the (D, D)-space V(g) evidently satisfies the requirements of our
corollary.




Another consequence of Theorem 4.1 is the usual result on extensions
of isomorphisms.⁶


COROLLARY 2. Let D be normal and finite over K, E a division ring
such that D ⊃ E ⊃ K. Then every isomorphism of E into D which leaves
the elements of K fixed can be extended to an automorphism of D over K.


Proof. Let a be such an isomorphism of E into D. Then the (D, E)-
space V(a) is simple and hence is (D, E)-operator isomorphic with a V(ḡ),
where ḡ is the restriction to E of an automorphism g of D over K. As we
have seen above, this implies that there is an element d ∈ D such that
a{e} = dḡ{e}d^{-1}, for every e ∈ E. The automorphism x → dg{x}d^{-1} is the
required extension of a to an automorphism of D over K.


A similar result is the following: 


COROLLARY 3. Let D ⊃ E ⊃ K, as above. Let a and b be two
automorphisms of D over K, and let τ be an additive homomorphism of E into D
such that τ{K} = (0), and τ{e_1e_2} = a{e_1}τ{e_2} + τ{e_1}b{e_2}, for all e_1,
e_2 ∈ E. Then there exists an element d ∈ D such that τ{e} = a{e}d − db{e}, for
all e ∈ E.


Proof. We define a K-regular (D, E)-space V of dimension 2 over D as
follows: The additive group of V is the direct sum of two copies D′ and D″
of that of D. For d_1′ ∈ D′, d_2″ ∈ D″, and d ∈ D, we define d·(d_1′, d_2″)
= ((dd_1)′, (dd_2)″), and, for e ∈ E, (d_1′, d_2″)·e = ((d_1a{e})′, (d_1τ{e}
+ d_2b{e})″). It is easily verified that this defines in V the structure of a
(D, E)-space. If V_0 is the subset consisting of the elements 1′ = (1′, 0) and
1″ = (0, 1″) we see that the requirements of Definition 4.1 are satisfied, so
that V is K-regular.

Since D″ is a K-regular (D, E)-subspace of V, it follows from Theorem
4.1 that there exists a K-regular (D, E)-subspace W such that V is the
direct sum of W and D″. Let us write, in accordance with this
decomposition, 1′ = w + d_1″, where w ∈ W. Operating with e ∈ E, we obtain
a{e}′ + τ{e}″ = w·e + (d_1b{e})″. Operating with a{e} on the left, we get

\[ a\{e\}' = a\{e\} \cdot w + (a\{e\} d_1)''. \]

Hence, by comparison of the components in D″,

\[ \tau\{e\} = d_1 b\{e\} - a\{e\} d_1 = a\{e\}(-d_1) - (-d_1) b\{e\}, \]

which proves Corollary 3.


⁶ This was proved by H. Cartan; see footnote 4.


It is easy to verify that the additive endomorphism x → a{x}d − db{x}
of D still has the formal properties of τ. Thus, our result is also an extension
theorem. If we take E = D and a = 1 = b, we conclude in particular that
every derivation of D over K is an inner derivation.
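
The verification that x → a{x}d − db{x} satisfies the product rule of Corollary 3 can also be carried out mechanically. The following sketch makes several assumptions that are not in the paper: D = E is the ring of rational quaternions, K is its centre (the rationals), a and b are inner automorphisms, and all function names are hypothetical.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficient tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))


def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))


def inner(u):
    """The inner automorphism x -> u x u^{-1}; it fixes the centre K."""
    a, b, c, d = (Fraction(x) for x in u)
    n = a*a + b*b + c*c + d*d
    u_inv = (a / n, -b / n, -c / n, -d / n)
    return lambda x: qmul(qmul(u, x), u_inv)


def make_tau(a, b, d):
    """The mapping tau{x} = a{x} d - d b{x}."""
    return lambda x: qsub(qmul(a(x), d), qmul(d, b(x)))


def product_rule_holds(a, b, tau, x, y):
    """Check tau{xy} = a{x} tau{y} + tau{x} b{y}."""
    lhs = tau(qmul(x, y))
    rhs = qadd(qmul(a(x), tau(y)), qmul(tau(x), b(y)))
    return lhs == rhs


a = inner((1, 1, 0, 0))
b = inner((1, 0, 2, 0))
tau = make_tau(a, b, (0, 1, 1, 3))
print(product_rule_holds(a, b, tau, (1, 2, 0, -1), (0, 3, 1, 1)))
```

With a and b both replaced by the identity mapping, tau becomes the inner derivation x → xd − dx mentioned at the end of the paragraph.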


THEOREM 4.2. Let D be of finite dimension over K. Then D is normal
over K if and only if every indecomposable K-regular (D, D)-space is of
dimension 1 over D.


Proof. The necessity of the condition follows at once from the remark
following the proof of Theorem 4.1.


Conversely, suppose that the condition of our theorem is satisfied.
Consider the Kronecker product D × D with respect to K. As a vector
space over D this is of finite dimension [D : K]. Hence D × D can be
decomposed into the direct sum of a finite number of indecomposable (D, D)-
subspaces. Say D × D = M_1 + ··· + M_s. In particular, 1 × 1 = m_1 + ··· + m_s,
with m_i ∈ M_i. We have then D·m_i·D = M_i, and k·m_i = m_i·k,
for each i, and every k ∈ K. Thus each M_i is K-regular, and hence has
dimension 1 over D, so that M_i = D·m_i. Hence, for every d ∈ D, m_i·d
= g_i{d}·m_i, and it is evident that the mapping g_i of D into itself which is
thus defined is an automorphism of D over K. If an element d ∈ D is left
fixed by every g_i it follows that (1 × 1)·d = d·(1 × 1), or 1 × d = d × 1,
which implies that d ∈ K. Hence D is normal over K.


COROLLARY. If D is finite and normal over K, and D ⊃ E ⊃ K, then
D is normal over E.


Proof. Every indecomposable E-regular (D, D)-space is a fortiori K-
regular, and hence of dimension 1 over D. Hence D is normal over E.

We have seen above that if D is finite and normal over K the classes of 
(D, D)-operator isomorphic simple K-regular (D, D)-spaces are in a one to 
one correspondence with the elements of the group of outer automorphisms 
of D over K (i.e., the full automorphism group, reduced modulo the inner 
automorphisms). In this correspondence, the Kronecker product of two 
simple (D, D)-spaces in the classes determined by the outer automorphisms 
α and β, respectively, belongs to the class determined by αβ. Hence:


THEOREM 4.3. The classes of operator isomorphic simple K-regular 
(D, D)-spaces, under the operation induced by Kronecker multiplication of 
(D, D)-spaces, constitute a group which is isomorphic with the group of outer 
automorphisms of D over K. 




5. Relative cyclic spaces. 


Definition 5.1. Let D be a division ring, E an arbitrary ring with an
identity element. Let M be a (D, E)-space which is generated by a single
element m ∈ M, i.e., M = D·m·E. Then the pair (M, m) will be called
a relative cyclic (D, E)-space.


Definition 5.2. Let (M, m) and (N, n) be two relative cyclic (D, E)-
spaces. We shall say that (N, n) covers, or is a cover of, (M, m) if there
is a (D, E)-operator homomorphism of N onto M which maps n into m. If
(N, n) and (M, m) cover each other, i.e., if they are operator isomorphic in
such a way that n and m correspond to each other, they are said to be
equivalent.

Let V be an arbitrary (D, E)-space, S an arbitrary subset of V. By a
(D, E)-mapping of S into V we shall mean a mapping φ such that there
exist elements d_1, ..., d_r in D and e_1, ..., e_r in E with

\[ \varphi\{s\} = \sum_{i=1}^{r} d_i \cdot s \cdot e_i, \]

for every s ∈ S. If, for d ∈ D and e ∈ E, we define (d·φ){s} = d·φ{s}, and
(φ·e){s} = φ{s}·e, we obtain in the additive group of all (D, E)-mappings
of S into V the structure of a (D, E)-space. The identity mapping i of S
into V is evidently a (D, E)-mapping, and D·i·E coincides with the set of
all (D, E)-mappings of S into V.

The set of all (D, E)-mappings of V into V has the natural structure
of a ring, H say. On the other hand, if h_0 is the identity mapping of V
into V, the pair (H, h_0) is a relative cyclic (D, E)-space, according to what
we have just seen. It is easy to characterize these spaces: In fact, if h_1 is an
arbitrary element of H we may form the relative cyclic (D, E)-space
(D·h_1·E, h_1). For h ∈ H, define ω{h} = hh_1. Then ω is clearly a (D, E)-
operator homomorphism of H onto D·h_1·E (= Hh_1) and ω{h_0} = h_1.
Hence (H, h_0) covers (D·h_1·E, h_1).


Conversely, suppose that (M, m_0) is a relative cyclic (D, E)-space such
that, for every m_1 ∈ M, (M, m_0) covers (D·m_1·E, m_1). Let h_0 denote the
identity mapping of M onto M, H the (D, E)-space consisting of all (D, E)-
mappings of M into M. Then we claim that (M, m_0) is equivalent to (H, h_0).
Evidently, (H, h_0) covers (M, m_0) by the operator homomorphism ψ, where
ψ{h} = h{m_0}. Moreover, if h{m_0} = 0 then, since (M, m_0) covers
(D·m_1·E, m_1), we have also h{m_1} = 0, for every m_1 ∈ M. Hence
h{m_0} = 0 implies that h = 0, i.e., ψ is an isomorphism, which proves our
assertion.

Thus, the full operator rings of arbitrary (D, E)-spaces may be regarded
as special examples of relative cyclic (D, E)-spaces. In the case where D
and E are commutative, it is seen at once from the above characterization
that every relative cyclic (D, E)-space is equivalent to its own operator ring.

The study of relative cyclic (D, E)-spaces is greatly facilitated by the
fact that there is a one to one correspondence between the equivalence
classes of finite dimensional relative cyclic (D, E)-spaces and the finite
dimensional (D*, E*)-subspaces of the space of all additive homomorphisms
of E into D.

Let $(M, m_0)$ be a relative cyclic $(D, E)$-space, and denote by $S(M)$ the
set of all $D$-linear mappings of $M$ into $D$, regarding $M$ and $D$ as vector spaces
over $D$ in the natural fashion. We shall regard $S(M)$ as a $(D^*, E^*)$-space
as follows: If $\sigma \in S(M)$, $d \in D$, and $e \in E$, we define $d^* \cdot \sigma$ as the composite
mapping $d^*\sigma$ ($\sigma$ followed by $d^*$), and $\sigma \cdot e^*$ by setting, for $m \in M$,
$(\sigma \cdot e^*)\{m\} = \sigma\{m \cdot e\}$.

Now with each $\sigma \in S(M)$ we associate an additive homomorphism $\sigma'$ of $E$
into $D$, where $\sigma'\{e\} = \sigma\{m_0 \cdot e\}$. Then $(d^* \cdot \sigma)' = d^*\sigma'$, and $(\sigma \cdot e^*)' = \sigma'e^*$,
so that the $\sigma'$ for $\sigma \in S(M)$ constitute a $(D^*, E^*)$-subspace $R(M, m_0)$ of the
space of all additive homomorphisms of $E$ into $D$. The mapping $\sigma \to \sigma'$ is a
$(D^*, E^*)$-operator isomorphism of $S(M)$ onto $R(M, m_0)$. We shall call
$R(M, m_0)$ the relations space of $(M, m_0)$.


THEOREM 5.1. If either $M$ is of finite dimension $[M : D]$ over $D$, or
$R(M, m_0)$ is of finite dimension $[R(M, m_0) : D^*]$ over $D^*$, then $[M : D]
= [R(M, m_0) : D^*]$.


Proof. We can find a set $(e_\alpha)$ of elements of $E$ such that the elements
$m_0 \cdot e_\alpha$ constitute a basis for $M$ over $D$. Let $\sigma_\beta$ be the $D$-linear mapping of
$M$ into $D$ which is such that $\sigma_\beta\{m_0 \cdot e_\alpha\} = \delta_{\alpha\beta}$. If $[M : D]$ is finite and
$\sigma \in S(M)$ we have evidently $\sigma = \sum_\alpha (\sigma\{m_0 \cdot e_\alpha\})^* \sigma_\alpha$. Since the $\sigma_\alpha$ are linearly
independent over $D^*$, it follows that $[S(M) : D^*] = [M : D]$. Since the
mapping $\sigma \to \sigma'$ is a $D^*$-linear isomorphism, this gives $[R(M, m_0) : D^*]
= [M : D]$. If $M$ is not finite over $D$, the $\sigma'_\alpha$ are infinitely many elements
of $R(M, m_0)$ which are linearly independent over $D^*$, so that $R(M, m_0)$
cannot be finite over $D^*$. This completes the proof.
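
As a hedged illustration of Theorem 5.1 (an example chosen here, not one given in the paper), assume $D = \mathbb{R}$, $E = \mathbb{C}$, and $M = \mathbb{C}$ with $d \cdot m \cdot e = dme$ and $m_0 = 1$, so that $D \cdot m_0 \cdot E = M$. Under these assumptions the basis may be taken as $e_1 = 1$, $e_2 = i$, with $\sigma_1$ and $\sigma_2$ the real and imaginary parts, and the two dimensions can be compared directly:

```latex
% Illustrative assumption: D = \mathbb{R}, E = \mathbb{C}, M = \mathbb{C}, m_0 = 1.
\begin{align*}
S(M) &= \operatorname{Hom}_{\mathbb{R}}(\mathbb{C}, \mathbb{R}),
  \qquad \sigma_1 = \operatorname{Re}, \quad \sigma_2 = \operatorname{Im}, \\
\sigma'\{e\} &= \sigma\{m_0 \cdot e\} = \sigma\{e\}
  \quad\text{for every } e \in \mathbb{C}, \\
R(M, m_0) &= \operatorname{Hom}_{\mathbb{R}}(\mathbb{C}, \mathbb{R}), \\
[R(M, m_0) : D^*] &= 2 = [M : D].
\end{align*}
```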


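THEOREM 5.2. $(M, m_0)$ covers $(N, n_0)$ if and only if $R(M, m_0) \supseteq R(N, n_0)$.


Proof. Suppose that $(M, m_0)$ covers $(N, n_0)$ by the operator homo-
morphism $h$. Then, if $\tau \in S(N)$, we have $\tau h \in S(M)$, and $(\tau h)' = \tau'$, which
shows that $R(M, m_0) \supseteq R(N, n_0)$.

Conversely, suppose that $R(M, m_0) \supseteq R(N, n_0)$. Then if $d_\alpha \in D$ and
$e_\alpha \in E$ are such that $\sum_\alpha d_\alpha \cdot m_0 \cdot e_\alpha = 0$ we have

$$\sum_\alpha d_\alpha \sigma'\{e_\alpha\} = \sigma\Big\{\sum_\alpha d_\alpha \cdot m_0 \cdot e_\alpha\Big\} = 0,$$

for all $\sigma' \in R(M, m_0)$. A fortiori, $\sum_\alpha d_\alpha \tau'\{e_\alpha\} = 0$, for all $\tau' \in R(N, n_0)$, i.e.,
$\tau\{\sum_\alpha d_\alpha \cdot n_0 \cdot e_\alpha\} = 0$, for all $\tau \in S(N)$. But this clearly implies
$\sum_\alpha d_\alpha \cdot n_0 \cdot e_\alpha = 0$. Hence there is a mapping $h$ of $M$ onto $N$ such that
$h\{\sum_\alpha d_\alpha \cdot m_0 \cdot e_\alpha\} = \sum_\alpha d_\alpha \cdot n_0 \cdot e_\alpha$, for all $d_\alpha \in D$ and $e_\alpha \in E$. Thus, $(M, m_0)$
covers $(N, n_0)$, and our proof is complete.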


Theorem 5.2 implies that a relative cyclic space is determined to within
an equivalence by its relations space. Thus, for instance, a necessary and
sufficient condition for the relative cyclic $(D, E)$-space $(M, m_0)$ to be equi-
valent to the full operator ring of some $(D, E)$-space (cf. above) is that
$D'R(M, m_0)E' = R(M, m_0)$, where $D'$ denotes the ring of left multiplica-
tions $x \to d'\{x\} = dx$ effected by the elements of $D$ in $D$, and similarly $E'$
denotes the ring of left multiplications in $E$.

The following theorem establishes the one-to-one correspondence between 
relative cyclic spaces and spaces of additive homomorphisms which we 
announced above. 


THEOREM 5.3. Let $T$ be a $(D^*, E^*)$-subspace of the space of all additive
homomorphisms of $E$ into $D$ such that $[T : D^*]$ is finite. Then there exists
a relative cyclic $(D, E)$-space $(M, m_0)$ such that $R(M, m_0) = T$.


Proof. We may regard $D$ as a vector space over $D^*$, such that
$d_1^* \cdot d = dd_1$. With this understanding, let $M$ be the additive group of all
$D^*$-linear mappings of $T$ into $D$. We make $M$ into a $(D, E)$-space by
setting, for $d \in D$, $m \in M$, $U \in T$, and $e \in E$, $(d \cdot m)\{U\} = dm\{U\}$, and
$(m \cdot e)\{U\} = m\{Ue^*\}$. Let $m_0$ be the mapping $U \to m_0\{U\} = U\{1\}$. Then
$m_0 \in M$. We claim that $D \cdot m_0 \cdot E = M$. In fact, by Lemma 2.1, there exist
elements $e_1, \ldots, e_n$ in $E$ and a basis $U_1, \ldots, U_n$ of $T$ over $D^*$ such that
$U_i\{e_j\} = \delta_{ij}$. If $m$ is an arbitrary element of $M$ we have

$$\Big(\sum_{i=1}^{n} m\{U_i\} \cdot m_0 \cdot e_i\Big)\{U_j\} = \sum_{i=1}^{n} m\{U_i\}\,U_j\{e_i\} = m\{U_j\},$$

whence $m = \sum_{i=1}^{n} m\{U_i\} \cdot m_0 \cdot e_i$.

Let $U$ be an arbitrary element of $T$, and define $\bar{U}\{m\} = m\{U\}$. Then
$\bar{U}\{d \cdot m\} = d\bar{U}\{m\}$, i.e., $\bar{U} \in S(M)$. Furthermore, $\bar{U}'\{e\} = \bar{U}\{m_0 \cdot e\} = U\{e\}$,
i.e., $U = \bar{U}' \in R(M, m_0)$. Thus $T \subseteq R(M, m_0)$. But $[R(M, m_0) : D^*]
= [M : D] = [T : D^*]$, whence $T = R(M, m_0)$.


6. Least common covers and decompositions. 


Definition 6.1. Let $(M_\alpha, m_\alpha)$ be any set of relative cyclic $(D, E)$-spaces.
By a least common cover of the $(M_\alpha, m_\alpha)$ we understand a relative cyclic
$(D, E)$-space $(P, p_0)$ which covers each $(M_\alpha, m_\alpha)$ and is covered by every
relative cyclic space which covers each $(M_\alpha, m_\alpha)$.

It is clear that, to within equivalence, there is at most one least common
cover for a given set of $(D, E)$-spaces. We shall first prove an existence
theorem:


THEOREM 6.1. Every set of relative cyclic (D,E)-spaces has a least 
common cover. 


Proof. Let $M$ be the direct sum of all the $M_\alpha$. Denote by $P$ the $(D, E)$-
space consisting of all $(D, E)$-mappings of the set $(m_\alpha)$ into $M$. If $p_0$ is
the identity mapping of the set $(m_\alpha)$ into $M$ we have $p_0 \in P$, and $D \cdot p_0 \cdot E = P$.
Evidently, the relative cyclic $(D, E)$-space $(P, p_0)$ covers each $(M_\alpha, m_\alpha)$.
On the other hand, if $(Q, q_0)$ covers each $(M_\alpha, m_\alpha)$ then $\sum_\beta d_\beta \cdot q_0 \cdot e_\beta = 0$
implies that $\sum_\beta d_\beta \cdot m_\alpha \cdot e_\beta = 0$, for each $\alpha$, and hence that $\sum_\beta d_\beta \cdot p_0 \cdot e_\beta = 0$,
which shows that $(Q, q_0)$ covers $(P, p_0)$.


THEOREM 6.2. Let $(M, m_0)$ and $(N, n_0)$ be relative cyclic $(D, E)$-
spaces of finite dimension over $D$. Let $(P, p_0)$ be a least common cover of
$(M, m_0)$ and $(N, n_0)$. Then $R(P, p_0) = R(M, m_0) + R(N, n_0)$.


Proof. It is clear from Theorem 5.2 that $R(P, p_0)$ contains $R(M, m_0)
+ R(N, n_0)$. On the other hand, by Theorem 5.3, we can construct a relative
cyclic $(D, E)$-space $(Q, q_0)$ such that $R(Q, q_0) = R(M, m_0) + R(N, n_0)$.
Then $(Q, q_0)$ covers $(M, m_0)$ and $(N, n_0)$, and hence also $(P, p_0)$, so that
$R(Q, q_0) \supseteq R(P, p_0)$, which gives the desired equality.


Let $(M, m_0)$ be a relative cyclic $(D, E)$-space, and suppose that we can
decompose $M$ into the direct sum of a finite number of $(D, E)$-subspaces,
say $M = M_1 + \cdots + M_s$. Then $m_0 = m_1 + \cdots + m_s$, with $m_i \in M_i$, and
$D \cdot m_i \cdot E = M_i$.

Let $S_i$ be the subspace of $S(M)$ which consists of all $\sigma$ such that
$\sigma\{M_j\} = (0)$, for $j \neq i$. Clearly, $S(M)$ is the direct sum of the $S_i$, and $S_i$
is isomorphic with $S(M_i)$ in a natural fashion. If $\sigma \in S(M)$ we write
$\sigma = \sigma_1 + \cdots + \sigma_s$, with $\sigma_i \in S_i$. Then we have $\sigma\{m_0 \cdot e\} = \sigma_1\{m_1 \cdot e\} + \cdots
+ \sigma_s\{m_s \cdot e\}$, which shows that each $\sigma' \in R(M, m_0)$ may be written in the
form $\sigma' = \sigma_1' + \cdots + \sigma_s'$, where $\sigma_i'$ is the element of $R(M_i, m_i)$ which
corresponds to $\sigma_i$. Moreover, it is clear from what we have said that if there
is arbitrarily given, for each index $i$, an element $\rho_i$ of $R(M_i, m_i)$ we can
find $\sigma \in S(M)$ such that $\sigma_i' = \rho_i$, for each $i$. Hence $R(M, m_0)$ is equal to
the sum of the $R(M_i, m_i)$. Finally, we claim that this sum is direct. In fact,
if we have $\sigma_1' + \cdots + \sigma_s' = 0$ we obtain from the above $\sigma\{m_0 \cdot e\} = 0$, for
all $e \in E$, whence $\sigma = 0$, and therefore each $\sigma_i' = 0$.

In the finite dimensional case, we can obtain the following complete 
result : 


THEOREM 6.3. Let $(M, m_0)$ be a finite dimensional relative cyclic
$(D, E)$-space. Then $M$ is decomposable if and only if $R(M, m_0)$ is decom-
posable, and the number of indecomposable components is the same for each.


Proof. There remains only to show that a decomposition of $R(M, m_0)$
leads to a decomposition of $M$. Suppose that $R(M, m_0)$ is the direct sum of
two $(D^*, E^*)$-subspaces $T_1$ and $T_2$. From the proof of Theorem 5.3, we
know that $(M, m_0)$ is equivalent with a relative cyclic space $(N, n_0)$, where
$N$ is the space of all $D^*$-linear mappings of $R(M, m_0)$ into $D$. Clearly, $N$
is the direct sum of two $(D, E)$-subspaces $N_1$ and $N_2$, where $N_i$ is the set
of all elements of $N$ which map $T_i$ into $(0)$. This, together with the above,
evidently suffices to establish Theorem 6.3.


7. Products. 


Definition 7.1. Let $C$ and $D$ be division rings, $E$ an arbitrary ring
with an identity element. Let $(M, m_0)$ be a relative cyclic $(C, D)$-space,
$(N, n_0)$ a relative cyclic $(D, E)$-space. Construct the Kronecker product
$(C, E)$-space $M \times N$ with respect to $D$, and let $L$ denote the subspace
$C \cdot (m_0 \times n_0) \cdot E$. Then the pair $(L, m_0 \times n_0)$ is called the product of
$(M, m_0)$ by $(N, n_0)$.


THEOREM 7.1. Let $(M, m_0)$ and $(N, n_0)$ be as above, and assume that
$M$ is finite over $C$ and $N$ is finite over $D$. Let $(L, m_0 \times n_0)$ be their product.
Then $R(L, m_0 \times n_0) = R(M, m_0)R(N, n_0)$.


Proof. For every $\tau \in S(N)$ we can define a $C$-linear mapping $\bar\tau$ of
$M \times N$ into $M$ such that $\bar\tau\{m \times n\} = m \cdot \tau\{n\}$. Then, for every $\sigma \in S(M)$,
we have $\sigma\bar\tau \in S(M \times N)$. Let $\rho$ be the restriction of $\sigma\bar\tau$ to $L$, and denote by
$\rho'$ the corresponding element of $R(L, m_0 \times n_0)$. Then we have clearly
$\rho' = \sigma'\tau'$, which shows that $R(M, m_0)R(N, n_0) \subseteq R(L, m_0 \times n_0)$.

Now choose elements $d_1, \ldots, d_r$ in $D$ and $e_1, \ldots, e_s$ in $E$ such that
the $m_0 \cdot d_i$ and $n_0 \cdot e_j$ constitute bases for $M$ over $C$ and $N$ over $D$, respectively.
Then the elements $m_0 \cdot d_i \times n_0 \cdot e_j$ constitute a basis for $M \times N$ over
$C$. We have

$$m_0 \times n_0 \cdot e = \sum_{i,j} \sigma_i'\{\tau_j'\{e\}\} \cdot m_0 \cdot d_i \times n_0 \cdot e_j,$$

with the $\tau_j' \in R(N, n_0)$ and the $\sigma_i' \in R(M, m_0)$.
By Lemma 2.1, we can find elements $f_1, \ldots, f_t$ in $E$ and a basis
$\rho_1', \ldots, \rho_t'$ for $R(M, m_0)R(N, n_0)$ over $C^*$ such that $\rho_p'\{f_q\} = \delta_{pq}$. Then
we may write $\sigma_i'\tau_j' = \sum_{p=1}^{t} c^*_{ijp}\rho_p'$, with $c_{ijp} \in C$, and

$$m_0 \times n_0 \cdot e = \sum_{i,j,p} \rho_p'\{e\}c_{ijp} \cdot m_0 \cdot d_i \times n_0 \cdot e_j.$$

Hence

$$m_0 \times n_0 \cdot f_q = \sum_{i,j} c_{ijq} \cdot m_0 \cdot d_i \times n_0 \cdot e_j,$$

and, on substituting back, $m_0 \times n_0 \cdot e = \sum_p \rho_p'\{e\} \cdot m_0 \times n_0 \cdot f_p$. If $\gamma \in S(L)$
this gives $\gamma' = \sum_p (\gamma\{m_0 \times n_0 \cdot f_p\})^* \rho_p'$, whence $R(L, m_0 \times n_0)
\subseteq R(M, m_0)R(N, n_0)$. This completes the proof.

From now on, let $E$ denote a division ring, and $D$ a division subring of $E$.
Then, if $(M, m_0)$ and $(N, n_0)$ are relative cyclic $(D, E)$-spaces we may form
their product with respect to $D$, as in Definition 7.1, regarding $M$ as a
$(D, D)$-space.


Definition 7.2. The relative cyclic $(D, E)$-space $(M, m_0)$ is said to be
closed if it covers the product of $(M, m_0)$ by $(M, m_0)$.
By Theorems 7.1 and 5.2, we have immediately:


THEOREM 7.2. If $M$ is finite over $D$ then the relative cyclic $(D, E)$-
space $(M, m_0)$ is closed if and only if $R(M, m_0)$ is a ring.


Denote by $K(M, m_0)$ the division subring of $D$ which consists of all
elements $d \in D$ for which $d \cdot m_0 = m_0 \cdot d$. It is clear that $K(M, m_0)$ coincides
with the set of all elements $d \in D$ whose left multiplications $e \to de$ in $E$
commute with all the elements of $R(M, m_0)$. This fact can be used to prove
the following result:


THEOREM 7.3. Suppose that $(M, m_0)$ is closed and finite over $D$. Then
$R(M, m_0)$ is the ring of all $K(M, m_0)$-linear mappings of $E$ into $D$, and
$[E : K(M, m_0)] = [M : D]$.


Proof. The first part follows at once from Theorems 7.2 and 2.1, using
the above remark. Further, from Theorem 2.1, we have that $E$ is finite over
$K(M, m_0)$ and $[E : K(M, m_0)] = [R(M, m_0) : D^*]$, whence the result, by
Theorem 5.1.


COROLLARY. If $(M, m_0)$ and $(N, n_0)$ are closed and finite over $D$ then
they are equivalent if and only if $K(M, m_0) = K(N, n_0)$.


The property of closure characterizes the Kronecker products, as follows: 


THEOREM 7.4. Let $E$ be a division ring, $D$ a division subring of $E$,
and $K$ a division subring of $D$ over which $E$ is finite. Then if $D \times E$ denotes
the Kronecker product with respect to $K$, the relative cyclic $(D, E)$-space
$(D \times E, 1 \times 1)$ is closed. Every closed relative cyclic $(D, E)$-space $(M, m_0)$
such that $M$ is finite over $D$ is equivalent with a Kronecker product $(D \times E,
1 \times 1)$, where $D \times E$ is taken with respect to $K(M, m_0)$.


Proof. Evidently, if $(N, n_0)$ is any relative cyclic $(D, E)$-space such
that $K(N, n_0) \supseteq K$, then $(D \times E, 1 \times 1)$, where $D \times E$ is taken with respect
to $K$, is a cover of $(N, n_0)$. Taking for $(N, n_0)$ the product of $(D \times E,
1 \times 1)$ by itself, we see that $(D \times E, 1 \times 1)$ is closed. Since $K(D \times E,
1 \times 1) = K$, the rest now follows from the above Corollary.

HARVARD UNIVERSITY. 




EXTENSION TYPES OF ABELIAN GROUPS.* 


By REINHOLD BAER.


When studying the manifold of extensions of the group $S$, we shall employ
two principles of classification: equivalence classes and extension types. Here
we say that the extensions $G$ and $H$ of $S$ are equivalent extensions of $S$, if
there exists an isomorphy between $G$ and $H$ which leaves invariant every
element in $S$; and the extensions $G$ and $H$ of $S$ belong to the same extension
type of $S$, if there exists a homomorphism of $G$ into $H$ and a homomorphism
of $H$ into $G$ which both leave invariant every element in $S$. It is clear that
equivalent extensions belong to the same extension type. If on the other
hand $G$ is some extension of $S$, then direct sums $G \oplus T$ will, in general,
not be equivalent extensions of $S$, though they all belong clearly to the same
extension type. It is just this observation which justifies the introduction
of extension types; for in constructing extensions of the group $S$ the step
from the extension $G$ to the extension $G \oplus T$ will be considered a trivial step.
The extension types have another advantage. They form a partially ordered
set, if we define: the extension type of the extension $G$ of $S$ is smaller than
or equal to the extension type of the extension $H$ of $S$, if there exists a
homomorphism of $G$ into $H$ which leaves invariant every element in $S$. The
first problem then is to give an internal characterization of this partially
ordered set of extension types.

It is advantageous to begin such an investigation with the discussion
of a somewhat more general problem: given an extension $G$ of some group
$S$ and a homomorphism $\eta$ of $S$ into some group $H$; to find conditions necessary
and sufficient for the existence of a homomorphism of $G$ into $H$ which induces
$\eta$ in $S$. [If $\eta$ happens to be an isomorphism, as it will be in our applications,
then one may ask the deeper question under which circumstances $\eta$ is
induced by an isomorphism of $G$ upon $H$.] To answer such questions one
has to find invariants of the extension $G$ of $S$; and these invariants can be
used to characterize equivalence classes and the partially ordered set of
extension types.

Little is known as yet concerning the problems which we have outlined. 
Thus we shall give here a complete and concrete solution of them for a 
restricted class of extensions only. We term the abelian group $G$ a "little"


* Received September 5, 1947; revised September 28, 1948. 




extension of its subgroup $S$, if $G/S$ does not contain elements of order 0,
and if the presence of an infinity of elements of order a prime $p$ in $G$ implies
the absence of elements of order $p$ in $G/S$. The partially ordered set of
types of little extensions of $S$ may then be characterized by means of systems
of Loewy chains of $S$ [3, Characterization Theorem]. One may show further-
more that the little extensions $G$ and $H$ of $S$ belong to the same type if, and
only if, $G$ and $H$ possess direct summands which are equivalent extensions
of $S$; and that $G$ and $H$ are equivalent extensions if, and only if, they belong
to the same type and are isomorphic [3, Theorem 1 and 1, Equivalence
Criterion].

The restriction to little extensions has been dictated mainly by the 
method of proof used. Further restriction, say to finite groups, would have 
gained little in simplicity, since in particular the rather intricate proof of 
the existence of extensions with given invariants [in section 2] derives its 
main complications from a situation which arises already with finite groups.

There appears to be little connection between what is usually termed a
group extension and the study of extensions as proposed here. This will be
quite clear, if one just inspects the definitions: an extension of the group $S$
is any group which contains $S$ as a subgroup; a group extension, however,
is a pair $(G, \eta)$ where $\eta$ is a homomorphism of the group $G$ and $G$ is then
termed an extension of the kernel of $\eta$ by the image of $G$ under $\eta$.


Notations. 


The groups considered will be abelian groups throughout; and their
composition will be addition $x + y$. The order of the group element $x$ is 0,
if $nx = 0$ implies $n = 0$; and otherwise the order of $x$ is the smallest positive
integer $n$ such that $nx = 0$.


$K[p, X]$ = set of all the elements $x$ in the group $X$ such that $px = 0$.


If $x$ is an element in, and $T$ a subset of, the group $X$, then $\{x\}$ and $\{T\}$
are the subgroups of $X$ which are generated by $x$ and $T$ respectively.


$V \cap W$ = cross cut of the sets $V$ and $W$.
$V \oplus W$ = direct sum of groups $V$ and $W$.
$V \cong W$ signifies isomorphy of groups $V$ and $W$.


The group $P$ is of type $p^\infty$ [in the sense of Prüfer], if it is generated
by elements $g(i)$ subject to the relations: $p\,g(0) = 0$, $p\,g(i+1) = g(i)$.
Here, as always, $p$ indicates a prime number.
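
The defining relations of a group of type $p^\infty$ can be checked on a concrete model. The sketch below is only an illustration under an assumption that is not stated in the paper: the generators are realised as the fractions $1/p^{i+1}$ in the additive group of rationals modulo 1. The helper names `g` and `times` are hypothetical.

```python
from fractions import Fraction


def g(i, p):
    """Assumed model of the generator g(i): the class of 1/p**(i+1) in Q/Z."""
    return Fraction(1, p ** (i + 1))


def times(k, x):
    """k-fold of x in Q/Z, i.e. k*x reduced modulo 1."""
    return (k * x) % 1


p = 3
# Relation p g(0) = 0.
print(times(p, g(0, p)) == 0)
# Relation p g(i+1) = g(i) for the first few indices.
print(all(times(p, g(i + 1, p)) == g(i, p) for i in range(5)))
```

In this model $g(i)$ has order $p^{i+1}$, so the group is the union of an ascending chain of cyclic groups of prime power order.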




Since all the groups considered are abelian, sums and differences of their 
homomorphisms and endomorphisms may be defined in the customary fashion. 


1. Extension of homomorphisms and isomorphisms. If $S$ is a sub-
group of the [commutative] group $G$, then we say alternately that $G$ is an
extension of $S$. If furthermore $\sigma$ and $\eta$ are homomorphisms of $S$ and $G$
respectively such that $x\sigma = x\eta$ for every element $x$ in $S$, then we say that $\sigma$
is induced by $\eta$ in $S$ and that $\eta$ is an extension of $\sigma$.


PROPOSITION 1. Suppose that the subgroup $S$ of the group $G$ and the
group $H$ satisfy the following conditions:


(a) No element in $G/S$ has order 0.


(b) If $G/S$ contains elements of order [the prime] $p$, then $H$ contains
only a finite number of elements of order $p$.


Then the homomorphism $\sigma$ of $S$ into $H$ is induced by a homomorphism
of $G$ into $H$ if, and only if,


(*) $(S \cap p^i G)\sigma \subseteq p^i H$ for every prime power $p^i$.


Proof. The necessity of condition (*) is an immediate consequence of
the fact that homomorphisms of $G$ into $H$ map $k$-folds of elements in $G$
upon $k$-folds of elements in $H$.

Assume now the validity of condition (*). For the purposes of the
present proof it will be convenient to say that the group $F$ between $S$ and $G$
is a finite extension of $S$, if $F/S$ is a finitely generated group. Because of
(a) this is equivalent to the assumption that $F/S$ is a finite group. The
system $\Phi$ of all the finite extensions $F$ of $S$ [which are contained in $G$] is a
partially ordered set with respect to the containedness relation. If $F$ is in $\Phi$,
then we denote by $\Theta_F$ the set of all the homomorphisms of $F$ into $H$ which
induce $\sigma$ in $S$. We prove first

(i) $\Theta_F$ is, for $F$ in $\Phi$, finite, but not vacuous.


Since $F/S$ is a finite group, it is the direct sum of cyclic groups of
prime power order:

$$F/S = \{S + z_1\} \oplus \cdots \oplus \{S + z_n\},$$

where $S + z_i$ is an element of prime power order $m_i$ in $F/S$. Then $m_i z_i$
belongs to $S$ and consequently to $S \cap m_i G$. Hence $(m_i z_i)\sigma$ belongs, by
Condition (*), to $(S \cap m_i G)\sigma \subseteq m_i H$; and this implies the existence of an
element $t_i$ in $H$ such that $(m_i z_i)\sigma = m_i t_i$. One verifies now without
difficulty the existence of one and only one homomorphism $\phi$ of $F$ into $H$
which induces $\sigma$ in $S$ and satisfies $z_i\phi = t_i$, since $F = \{S, z_1, \ldots, z_n\}$ and
since $F/S$ is the direct sum of the cyclic groups $\{S + z_i\}$. Thus we have
shown the existence of at least one homomorphism $\phi$ in $\Theta_F$.

If $F/S$ contains elements of order a prime $p$, then it follows from
Condition (b) that $H$ contains only a finite number of elements of order $p$.
If $m$ is a power of $p$, then this implies that the equation $mx = 0$ has only
a finite number of solutions in $H$. This we may apply on each of the
$m_i \neq 1$ so that each of the equations $m_i x = 0$ has only a finite number of
solutions $x$ in $H$.

If $\phi$ and $\psi$ are homomorphisms of $F$ into $H$ which both induce $\sigma$ in $S$,
then $(m_i z_i)\phi = (m_i z_i)\psi$, since $m_i z_i$ belongs to $S$. Hence $m_i z_i(\phi - \psi) = 0$
for every $i$. It follows from the preceding paragraph of the proof that there
exists only a finite number of elements in $H$ which may be images of $z_i$ under
the homomorphism $\phi - \psi$ of $F$ into $H$, namely the finitely many solutions
of the equation $m_i x = 0$ in $H$. Since $x(\phi - \psi) = 0$ for every $x$ in $S$, one
sees that the number of different homomorphisms $\phi - \psi$ with $\phi, \psi$ in $\Theta_F$ must
be finite [there exists only a finite number of homomorphisms of $F$ into $H$
which map $S$ upon 0]. But then $\Theta_F$ itself is finite; and this completes the
proof of the property (i).

(ii) If $F'$ and $F''$ are in $\Phi$, and if $F' \subseteq F''$, then every homomorphism
$\eta''$ in $\Theta_{F''}$ induces a homomorphism $\eta'$ in $\Theta_{F'}$.

This is an obvious consequence of our definitions. The mapping of $\eta''$
onto $\eta'$, described in (ii), is the projection of $\Theta_{F''}$ into $\Theta_{F'}$.

(iii) If $U$ and $V$ are in $\Phi$, then $U + V$ is in $\Phi$.

This is obvious.

(iv) If $F' \subseteq F'' \subseteq F'''$ are in $\Phi$, then the projection of $\Theta_{F'''}$ into $\Theta_{F'}$ is
the product of the projection of $\Theta_{F'''}$ into $\Theta_{F''}$ by the projection of $\Theta_{F''}$ into $\Theta_{F'}$.

This is an obvious consequence of the definition of projection as given
in (ii).

(v) There exists at least one single valued function $\eta(F)$ with the
properties:

(v.a) $\eta(F)$ is in $\Theta_F$ for every $F$ in $\Phi$;

(v.b) If $F' \subseteq F''$ are in $\Phi$, then $\eta(F')$ is induced by $\eta(F'')$.

Kurosh (2) has named such a function $\eta(F)$ a complete projection set
of our inverse system. The existence of such a complete projection set may
either be deduced from the results of Kurosh (1), (2) or from Steenrod (1),
Lemma 2.1, p. 665 [since finite sets are bicompact spaces].

If $\eta(F)$ is some function with the properties (v.a) and (v.b), then
there exists one and only one homomorphism $\eta$ of $G$ into $H$ which induces
$\eta(F)$ in $F$ for every $F$ in $\Phi$, since every finite subset of $G$ is contained in at
least one $F$ in $\Phi$. But this homomorphism $\eta$ of $G$ into $H$ induces $\sigma$ in $S$,
since this is done by every $\eta(F)$. This completes the proof.
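
Proposition 1 can be checked by brute force in a very small setting. The sketch below is an illustration only, under assumptions that are not in the paper: $G = \mathbf{Z}/n$, $S$ is the subgroup generated by a divisor $d$ of $n$, $H = \mathbf{Z}/m$, and $\sigma$ sends $d$ to $s$ (so $s$ must satisfy $(n/d)s = 0$ in $H$). Conditions (a) and (b) hold automatically because all groups are finite. The function names are hypothetical.

```python
def prime_powers(limit):
    """Return all prime powers p**i (i >= 1) that are <= limit."""
    powers = []
    for p in range(2, limit + 1):
        if all(p % r for r in range(2, int(p ** 0.5) + 1)):
            q = p
            while q <= limit:
                powers.append(q)
                q *= p
    return powers


def condition_star(n, d, m, s):
    """Check that (S ∩ qG)σ ⊆ qH for every prime power q <= n.

    G = Z/n, S = subgroup generated by d (d divides n), H = Z/m,
    and σ maps k*d to k*s mod m.
    """
    S = {k * d for k in range(n // d)}
    for q in prime_powers(n):
        qG = {(q * x) % n for x in range(n)}
        qH = {(q * y) % m for y in range(m)}
        image = {((x // d) * s) % m for x in S & qG}
        if not image <= qH:
            return False
    return True


def extension_exists(n, d, m, s):
    """A homomorphism Z/n -> Z/m is x -> x*t with n*t = 0 in Z/m;
    it induces σ in S exactly when d*t = s in Z/m."""
    return any((n * t) % m == 0 and (d * t) % m == s for t in range(m))


# G = Z/4, S = {0, 2}, H = Z/2, σ(2) = 1: condition (*) fails for q = 2,
# and no homomorphism of G into H induces σ.
print(condition_star(4, 2, 2, 1), extension_exists(4, 2, 2, 1))
```

Only prime powers up to $n$ are tested, which is enough here because the orders of the elements of $G/S$ do not exceed $n$.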

In the set $(G \to H; \sigma)$ of all the homomorphisms of $G$ into $H$ which
induce $\sigma$ in $S$ a topology may be introduced in the customary way: if $\eta$ is a
homomorphism of $G$ into $H$ which induces $\sigma$ in $S$, and if $a(1), \ldots, a(n)$
are a finite number of elements in $G$, then a neighborhood [the $a(1)$-, $\ldots$,
$a(n)$-neighborhood] of $\eta$ is formed by all the homomorphisms $\kappa$ in $(G \to H; \sigma)$
which satisfy $a(i)\eta = a(i)\kappa$ for $i = 1, \ldots, n$.

It follows from the last step of the proof of Proposition 1 that the space
$(G \to H; \sigma)$ is essentially the same as the space of all the functions $\eta(F)$
with Properties (v.a) and (v.b). If we apply now Steenrod (1), Theorem
2.1, p. 666 [whose applicability is assured since finite sets are bicompact
spaces] it follows that the space of functions $\eta(F)$ with Properties (v.a) and
(v.b) is a bicompact and non-vacuous space. Since this space is essentially
the same as the space $(G \to H; \sigma)$, we have shown the following fact.


COROLLARY 1. If $\sigma$ is a homomorphism of the subgroup $S$ of the group $G$
into the group $H$ such that


(a) no element in $G/S$ has order 0,


(b) the existence of elements of prime number order $p$ in $G/S$ implies
that there exists at most a finite number of elements of order $p$ in $H$,


(c) $(S \cap p^i G)\sigma \subseteq p^i H$ for every prime power $p^i$,


then the space $(G \to H; \sigma)$ of all the homomorphisms of $G$ into $H$ which
induce $\sigma$ in $S$ is bicompact and non-vacuous.


In order to be sure of the validity of conditions (a), (b) of Proposition 1 
and Corollary 1 we introduce the following concept. 


DEFINITION 1. The extension $G$ of its subgroup $S$ is a little extension
of $S$, if


(a) $G/S$ does not contain elements of order 0 and


(b) the existence of elements of prime number order $p$ in $G/S$ implies
that there exists at most a finite number of elements of order $p$ in $G$.


If $G$ is a little extension of its subgroup $S$, and if $K$ is between $G$ and $S$,
then $K$ is a little extension of $S$ too.

If $G$ is a little extension of its subgroup $S$, and if $G$ contains an infinity
of elements of order $p$, then every element of order $p$ in $G$ belongs to $S$;
and $x$ belongs to $S$ whenever $px$ belongs to $S$, so that $S \cap pG = pS$. This
very useful property may be used to prove that little extensions of little
extensions are little extensions.

Finally we note that the following fact is easily verified. The group $G$
is a little extension of each of its subgroups if, and only if, $G$ contains only
a finite number of elements of every given order.


COROLLARY 2. $S$ is a direct summand of its little extension $G$ if, and
only if,
(†) $p^i S = S \cap p^i G$ for every prime power $p^i$.


Proof. We note first that $p^i S \subseteq S \cap p^i G$ is always true. Hence Con-
dition (†) is equivalent to the condition


(††) $S \cap p^i G \subseteq p^i S$ for every prime power $p^i$.


It is well known that $S$ is a direct summand of $G$ if, and only if, there
exists a homomorphism of $G$ upon $S$ which leaves invariant every element
in $S$. Thus $S$ is a direct summand of $G$ if, and only if, the identity mapping
of $S$ may be extended to a homomorphism of $G$ into $S$. Since $G$ is a little
extension of $S$, we may apply Proposition 1 to this extension problem; and
it follows from Proposition 1 that the identity mapping of $S$ is induced by a
homomorphism of $G$ into $S$ if, and only if, (††) is true. This shows the
validity of our corollary.
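
For finite groups, which are little extensions of each of their subgroups, Corollary 2 can be tested exhaustively. The following sketch is an illustration under assumptions of its own: $G$ is a product of at most two cyclic groups $\mathbf{Z}/m_1 \times \mathbf{Z}/m_2$ (so that every subgroup needs at most two generators), elements are tuples, and the helper names are hypothetical. It compares Condition (†) with a brute-force search for a complement.

```python
from itertools import product


def prime_powers(limit):
    """Return all prime powers p**i (i >= 1) that are <= limit."""
    powers = []
    for p in range(2, limit + 1):
        if all(p % r for r in range(2, int(p ** 0.5) + 1)):
            q = p
            while q <= limit:
                powers.append(q)
                q *= p
    return powers


def generate(gens, mods):
    """Subgroup of Z/mods[0] x Z/mods[1] x ... generated by gens."""
    zero = tuple(0 for _ in mods)
    group, frontier = {zero}, [zero]
    while frontier:
        x = frontier.pop()
        for gen in gens:
            y = tuple((a + b) % m for a, b, m in zip(x, gen, mods))
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)


def multiples(q, subset, mods):
    """The set of q-folds of the elements of subset."""
    return {tuple((q * a) % m for a, m in zip(x, mods)) for x in subset}


def satisfies_dagger(S, mods):
    """Condition (†): qS = S ∩ qG for every prime power q <= |G|."""
    G = set(product(*(range(m) for m in mods)))
    return all(multiples(q, S, mods) == S & multiples(q, G, mods)
               for q in prime_powers(len(G)))


def is_direct_summand(S, mods):
    """Search for a complement C generated by at most two elements
    (sufficient when G has at most two cyclic factors)."""
    G = list(product(*(range(m) for m in mods)))
    for a, b in product(G, repeat=2):
        C = generate([a, b], mods)
        if len(C) * len(S) == len(G) and len(C & S) == 1:
            return True
    return False


mods = (2, 4)
for gen in [(1, 0), (1, 2), (0, 2)]:
    S = generate([gen], mods)
    print(gen, satisfies_dagger(S, mods), is_direct_summand(S, mods))
```

The subgroup generated by $(0, 2)$ fails Condition (†) for $p^i = 2$, since its non-zero element is a double in $G$ but not in $S$, and correspondingly it has no complement.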


Example 1. If $G$ is some commutative group, then we may form the
subgroup $P(G)$ of all the elements in $G$ whose order is not 0. Then every
element, not 0, in $G/P(G)$ has order 0; and so Condition (†) of Corollary 2
is satisfied by the subgroup $P(G)$. But it is well known that there exist
groups $G$ such that $P(G)$ is not a direct summand of $G$. Consequently it is
impossible to omit Condition (a) from Definition 1 without invalidating
Corollary 2.


Example 2. Suppose that $\{z_i\}$ is a cyclic group of order $p^i$ and that
$G$ is the direct sum of the cyclic groups $\{z_i\}$. Denote by $S$ the subgroup
of $G$ which is generated by the elements $z_i - pz_{i+1}$. It is easy to see that
these elements form a basis of $S$, that $G/S$ is a group of type $p^\infty$ [in the
sense of Prüfer] and that Condition (†) of Corollary 2 is satisfied by the
subgroup $S$ of $G$. But $S$ cannot be a direct summand of $G$, since $G$ does
not contain subgroups of type $p^\infty$. Consequently it is impossible to omit
Condition (b) from Definition 1 without invalidating Corollary 2.

Since Corollary 2 is essentially a special case of Proposition 1, one 
realizes that Examples 1 and 2 show the impossibility of omitting in Proposi- 
tion 1 the conditions imposed on S, G and H. 

The following "splitting criterion" will prove useful on various
occasions.


LEMMA 1. If $G$ is a little extension of its subgroup $S$, if the endo-
morphism $\tau$ of $G$ leaves invariant every element in $S$, then there exists a
direct decomposition $G = G' \oplus G''$ such that $S \subseteq G'$, $\tau$ induces an auto-
morphism in $G'$ and such that every element in $G''$ is annihilated by a suitable
power of $\tau$.

Proof. We begin by proving:

(a) If $z$ is an element in $G$, then the set $z, z\tau, \ldots, z\tau^i, \ldots$ of elements
in $G$ is finite.

This is quite obvious, if $z$ is in $S$, since then this set is a one element set.
If $z$ is not in $S$, then the order of $z$ modulo $S$ is an integer $n > 1$. If $p$ is a
prime divisor of $n$, then $G/S$ contains elements of order $p$; and the little
extension $G$ of $S$ contains therefore only a finite number of elements of order
$p$. Since $G$ contains, for every prime divisor $p$ of $n$, only a finite number
of elements of order $p$, it follows that $G$ contains only a finite number of
solutions of the equation $ny = 0$. Since $nz$ belongs to $S$, we have

$$n(z - z\tau^i) = nz - (nz)\tau^i = 0,$$

since elements in $S$ are left invariant by every power of the endomorphism $\tau$.
Since $G$ contains only a finite number of solutions of the equation $ny = 0$,
it follows that $z - z\tau^i$ can assume only a finite number of different values
in $G$; and this completes the proof of (a).

An endomorphism $\tau$ with the property (a) we have termed elsewhere
[Baer (1), p. 512] almost periodical; and we have shown [Baer (1),
Theorem B, p. 512] that almost periodical endomorphisms "split." In the
case of abelian groups this implies the existence of a direct decomposition
$G = G' \oplus G''$ with the following properties: $\tau$ induces an automorphism in
$G'$, and to every element $x$ in $G''$ there exists a positive integer $k = k(x)$
such that $x\tau^k = 0$. If $u$ is an element in $S$, then $u = u' + u''$ where $u'$
belongs to $G'$ and $u''$ belongs to $G''$. Since $u$ is left invariant by every
power of $\tau$, and since $u'$ is mapped upon an element in $G'$ by every power of $\tau$,
we find that $u = u\tau^k = u'\tau^k$, with $k = k(u'')$, is an element in $G'$.
Hence $S \subseteq G'$; and this completes the proof.
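
The decomposition of Lemma 1 can be computed directly in a finite cyclic group. The sketch below is an illustration under assumptions of its own, not the construction of Baer (1): $G = \mathbf{Z}/n$, the endomorphism is given as a Python function on `range(n)`, $G'$ is taken to be the stable image of the powers of $\tau$, and $G''$ the set of elements annihilated by $\tau^n$. The name `split_along` is hypothetical.

```python
def split_along(n, tau):
    """Return (G1, G2) for G = Z/n and an endomorphism tau of G.

    G1 is the stable image of the powers of tau (tau permutes it);
    G2 is the set of elements annihilated by some power of tau.
    """
    image = set(range(n))
    while True:
        smaller = {tau(x) for x in image}
        if smaller == image:
            break
        image = smaller

    def iterate(x, k):
        for _ in range(k):
            x = tau(x)
        return x

    # The kernels of the powers of tau stabilise after at most n steps.
    killed = {x for x in range(n) if iterate(x, n) == 0}
    return image, killed


n = 12
tau = lambda x: (4 * x) % n                    # endomorphism x -> 4x of Z/12
S = {x for x in range(n) if tau(x) == x}       # elements left invariant by tau
G1, G2 = split_along(n, tau)
print(sorted(S), sorted(G1), sorted(G2))
```

Here $\tau$ leaves invariant exactly the subgroup $S = \{0, 4, 8\}$, which lies in $G'$, while every element of $G'' = \{0, 3, 6, 9\}$ is annihilated by $\tau$ itself.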


PROPOSITION 2. If $G$ and $H$ are little extensions of their subgroups $S$
and $T$ respectively, then the following properties of the isomorphism $\sigma$ of $S$
upon $T$ are equivalent:


(i) $(S \cap p^i G)\sigma = T \cap p^i H$ for every prime power $p^i$.

(ii) $\sigma$ is induced in $S$ by a homomorphism of $G$ into $H$ and $\sigma^{-1}$ is
induced in $T$ by a homomorphism of $H$ into $G$.

(iii) There exist direct decompositions $G = G' \oplus G''$ and $H = H' \oplus H''$
such that $S \subseteq G'$, $T \subseteq H'$ and such that $\sigma$ is induced in $S$ by an isomorphism
of $G'$ upon $H'$.


Proof. We show first:


(a) If $G$ or $H$ contains an infinity of elements of order $p$, then neither
$G/S$ nor $H/T$ contains elements of order $p$.


Assume, for instance, that $G$ contains an infinity of elements of order $p$.
Since $G$ is a little extension of $S$, this implies that $G/S$ does not contain
elements of order $p$. Consequently all the infinitely many elements of order
$p$ in $G$ belong to $S$. But $S$ and $T$ are isomorphic; and so $T$ [and hence $H$]
contains an infinity of elements of order $p$. But $H$ is a little extension of $T$;
and consequently $H/T$ does not contain elements of order $p$. Hence (a)
is true.

If (i) is satisfied by $\sigma$, then $(S \cap p^i G)\sigma \subseteq p^i H$ and $(T \cap p^i H)\sigma^{-1} \subseteq p^i G$
for every prime power $p^i$. Hence we may infer from (a) and Proposition 1
[noting that $G/S$ and $H/T$ are free of elements of order 0] the existence
of a homomorphism of $G$ into $H$ which induces $\sigma$ in $S$ and the existence of a
homomorphism of $H$ into $G$ which induces $\sigma^{-1}$ in $T$. Thus (ii) is a conse-
quence of (i).

Assume next the validity of (ii). Then there exists a homomorphism $\gamma$
of $G$ into $H$ which induces $\sigma$ in $S$ and a homomorphism $\eta$ of $H$ into $G$
which induces $\sigma^{-1}$ in $T$. Then $\tau = \gamma\eta$ is an endomorphism of $G$ which leaves
invariant every element in $S$. We infer from Lemma 1 the existence of a
direct decomposition

$$G = G' \oplus G'', \qquad S \subseteq G',$$

such that $\tau$ induces an automorphism in $G'$ and such that every element in
$G''$ is annihilated by a suitable power of $\tau$. We let $H' = G'\gamma$ and we denote
by $H''$ the set of all the elements in $H$ which are annihilated by some power
of $\eta\gamma$. It is easy to see that $H'$ and $H''$ are subgroups of $H$. Furthermore
$T = S\sigma = S\gamma \subseteq G'\gamma = H'$. If $x$ is an element in $G'$ such that $x\gamma = 0$,
then $x\gamma\eta = 0$. But $\gamma\eta$ induces an automorphism in $G'$; and hence $x = 0$.
Thus we have shown that $\gamma$ induces an isomorphism of $G'$ upon $H'$ and that
this isomorphism induces $\sigma$ in $S$. Suppose now that $y$ is an element in
$H' \cap H''$. Then there exists an element $z$ in $G'$ such that $z\gamma = y$ and there
exists a positive integer $n$ such that $y(\eta\gamma)^n = 0$. Hence $z(\gamma\eta)^{n+1} = y(\eta\gamma)^n\eta
= 0$. But $\gamma\eta$ induces an automorphism in $G'$ so that $z = 0$. Hence $y = z\gamma = 0$
so that $H' \cap H'' = 0$. If $r$ is an element in $H$, then $r\eta$ is an element in $G$.
Hence $r\eta = g' + g''$ with $g'$ in $G'$ and $g''$ in $G''$. Since $\gamma\eta$ induces an auto-
morphism in $G'$, there exists an element $b$ in $G'$ such that $g' = b\gamma\eta$. It is
clear that $b\gamma = r'$ belongs to $H'$. Let $r'' = r - r'$. Since $g''$ belongs to $G''$,
there exists a positive integer $m$ such that $g''(\gamma\eta)^m = 0$. Hence

$$r''(\eta\gamma)^{m+1} = (r\eta - b\gamma\eta)(\gamma\eta)^m\gamma = (g' + g'' - b\gamma\eta)(\gamma\eta)^m\gamma
= g''(\gamma\eta)^m\gamma = 0,$$

so that $r''$ belongs to $H''$. Consequently $r$ belongs to $H' + H''$; and this
completes the proof of the fact that $H = H' \oplus H''$. Thus we have shown
that (iii) is a consequence of (ii).

If, finally, $G = G' \oplus G''$, $H = H' \oplus H''$, $S \subseteq G'$, $T \subseteq H'$, and if the
isomorphism $\alpha$ of $G'$ upon $H'$ induces $\sigma$ in $S$, then we have:

$$(S \cap p^i G)\sigma = (S \cap p^i G')\sigma = (S \cap p^i G')\alpha = T \cap p^i H' = T \cap p^i H$$

for every prime power $p^i$. Hence (i) is a consequence of (iii); and this
completes the proof.

In order to improve upon the preceding result we need the following 
fact. 


Lemma 2. If $G$ and $H$ are isomorphic little extensions of their subgroups
$S$ and $T$ respectively, if $G = G' \oplus G''$, $S \le G'$ and $H = H' \oplus H''$,
$T \le H'$, then the isomorphy of $G'$ and $H'$ implies the isomorphy of $G''$ and $H''$.


Proof. Since $G/S$ does not contain elements of order 0, and since
$G'' \cong G/G' \cong (G/S)/(G'/S)$, it follows that $G''$ [and likewise $H''$] does
not contain elements of order 0. If $G''$ contains elements of order $p$, then
$G/S$ [$\cong (G'/S) \oplus G''$] contains elements of order $p$ too; and this implies
that $G$ contains only a finite number of elements of order $p$. Likewise we




see that H contains only a finite number of elements of order p, if H” 
contains elements of order p. 

If $X$ is any group, then denote by $X_p$ the set of all the elements of
order a power of the prime $p$. This p-component [or primary component]
$X_p$ is a subgroup of $X$; and the absence of elements of order 0 in $X$ implies
that $X$ is the direct sum of its primary components.

If $G''$ or $H''$ contains elements of order $p$, then we infer from the
isomorphy of $G$ and $H$ that $G$ and $H$ both contain only a finite number of
elements of order $p$. Thus $G_p = G'_p$ and $H_p = H'_p$ whenever $G$ and $H$ contain
an infinity of elements of order $p$; and in this case both $G''_p$ and $H''_p$ are 0.

Assume now that $G$ and $H$ contain only a finite number of elements of
order $p$. Then $G_p = G'_p \oplus G''_p$ and $H_p = H'_p \oplus H''_p$. From the isomorphy
of $G$ and $H$ we infer the isomorphy of $G_p$ and $H_p$; and from the isomorphy
of $G'$ and $H'$ we infer the isomorphy of $G'_p$ and $H'_p$. But $G_p$ and $H_p$ contain
only a finite number of elements of order $p$; and thus they are direct sums of a
finite number of essentially uniquely determined p-groups of rank 1 [cyclic or
of type $p^\infty$]. Hence the isomorphies $G'_p \cong H'_p$ and
$G'_p \oplus G''_p \cong H'_p \oplus H''_p$ imply the isomorphy of $G''_p$ and $H''_p$. Thus we
have shown that corresponding primary components of $G''$ and $H''$ are
isomorphic; and this implies the desired isomorphy of $G''$ and $H''$.


Proposition 3. If $G$ and $H$ are little extensions of their subgroups $S$
and $T$ respectively, then the isomorphism $\sigma$ of $S$ upon $T$ is induced by an
isomorphism of $G$ upon $H$ if, and only if,

(a) $(S \cap p^iG)\sigma = T \cap p^iH$ for every prime power $p^i$ and

(b) $G \cong H$.


Proof. The necessity of these conditions is almost obvious. If (a) and
(b) are valid, then we infer from (a) and Proposition 2 that there exist
direct decompositions $G = G' \oplus G''$ and $H = H' \oplus H''$ such that $S \le G'$,
$T \le H'$ and such that $\sigma$ is induced by an isomorphism $\eta'$ of $G'$ upon $H'$.
It follows from (b) and the preceding Lemma 2 that $G''$ and $H''$ are
isomorphic; and consequently there exists an isomorphism of $G$ upon $H$ which
induces $\eta'$ in $G'$ and $\sigma$ in $S$.


EQUIVALENCE CRITERION. The little extensions $G$ and $H$ of their common
subgroup $S$ are equivalent extensions of $S$ if, and only if,

(a) $S \cap p^iG = S \cap p^iH$ for every prime power $p^i$ and
(b) $G \cong H$.




Here we term as usual the extensions $G$ and $H$ of $S$ equivalent, if there
exists an isomorphism of $G$ upon $H$ which leaves invariant every element in $S$.
The criterion is an obvious special case of Proposition 3.

It should be noted that we may substitute in many cases for condition
(b) the weaker condition

(b') $G/S \cong H/S$;

but that in general this is impossible, as may be seen from easily constructed
examples.


DEFINITION 2. The group $G$ is a minimal extension of its subgroup $S$,
if $G$ is a little extension of $S$ and if $G = G' \oplus G''$ and $S \le G'$ imply
$G'' = 0$ and $G = G'$.


The importance of this concept stems from the following simple 
observations. 


COROLLARY 3. Suppose that $G$ and $H$ are minimal extensions of their
subgroups $S$ and $T$ respectively.

(a) The isomorphism $\sigma$ of $S$ upon $T$ is induced by an isomorphism of
$G$ upon $H$ if, and only if,

$$(S \cap p^iG)\sigma = T \cap p^iH \quad \text{for every prime power } p^i.$$

(b) The homomorphism $\eta$ of $G$ into $H$ which satisfies $S\eta = T$ is an
isomorphism of $G$ upon $H$ if, and only if, $\eta$ induces an isomorphism of $S$
upon $T$ such that

$$(S \cap p^iG)\eta = T \cap p^iH \quad \text{for every prime power } p^i.$$

Proof. (a) is an immediate consequence of Proposition 2 and Definition
2. The necessity of the conditions of (b) is obvious. If the conditions of
(b) are satisfied, then we infer from (a) the existence of an isomorphism $\tau$
of $H$ upon $G$ which induces the isomorphism $\eta^{-1}$ in $T$. Then $\eta\tau$ is an
endomorphism of $G$ which leaves invariant every element in $S$; and it follows
from Lemma 1 and Definition 2 that $\eta\tau$ is an automorphism of $G$.
Consequently $\eta$ is an isomorphism of $G$ upon $H$.


Proposition 4. If $G$ is a little extension of its subgroup $S$, then there
exists one and essentially only one direct decomposition $G = G' \oplus G''$ such
that $G'$ is a minimal extension of $S$.


Here we term the direct decompositions $G = H' \oplus H'' = K' \oplus K''$ such
that $S \le H'$ and $S \le K'$ essentially the same [with respect to $S$], if there




exists an automorphism of $G$ which maps $H'$ upon $K'$, $H''$ upon $K''$ and
which leaves invariant every element in $S$.


Proof. Denote by $\epsilon$ the identity mapping of $S$. Then it follows from
Corollary 1 that $(G \to G; \epsilon)$ is a bicompact topological space. Every element
in $(G \to G; \epsilon)$ is an endomorphism of $G$ and some of these are idempotent
[$\eta = \eta^2$]; for instance the identity mapping of $G$. One verifies easily that
the set $\Xi$ of all the idempotents in $(G \to G; \epsilon)$ is a non-vacuous and closed
subset of $(G \to G; \epsilon)$ so that $\Xi$ too is a bicompact topological space.

If the direct summand $D$ of $G$ contains $S$, then there exists an idempotent
endomorphism of $G$ which maps $G$ upon $D$ and leaves invariant every
element in $D$ and, therefore, in $S$. It follows that the cross cut
$\Xi \cap (G \to D; \epsilon)$ is not vacuous; note that $(G \to D; \epsilon)$ may be considered
as a subset of $(G \to G; \epsilon)$. It follows from Corollary 1 that $(G \to D; \epsilon)$ is a
bicompact topological space. Hence $(G \to D; \epsilon)$ is a closed subset of
$(G \to G; \epsilon)$ and this implies that

$$\Xi_D = \Xi \cap (G \to D; \epsilon) \text{ is a non-vacuous closed subset of } (G \to G; \epsilon)$$

whenever $D$ is a direct summand of $G$ which contains $S$.

[Note that elements in $\Xi_D$ are idempotent endomorphisms of $G$ which
leave invariant every element in $S$ and which leave invariant only elements
in $D$.]

Consider now a well-ordered descending sequence $D(v)$ of direct summands
of $G$ which all contain $S$. Then the sets $\Xi_{D(v)}$ form a well-ordered
descending sequence of non-vacuous, closed subsets of the bicompact space
$(G \to G; \epsilon)$. Their cross cut is consequently not vacuous. If $\kappa$ belongs to
the cross cut of all the $\Xi_{D(v)}$, then $\kappa$ is an idempotent endomorphism of $G$
which leaves invariant every element in $S$ and which leaves invariant only
elements belonging to the cross cut of all the $D(v)$. The image $G\kappa$ is
consequently a direct summand of $G$ which contains $S$ and which is contained
in the cross cut of all the $D(v)$. Consequently there exists a minimal direct
summand $G'$ of $G$ which contains $S$; and it is clear that $G'$ is a minimal
extension of $S$. Since $G'$ is a direct summand of $G$, we have $G = G' \oplus G''$;
and thus we have shown the existence of at least one decomposition
$G = G' \oplus G''$ such that $G'$ is a minimal extension of $S$.
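
The role of idempotent endomorphisms in this argument can be seen in a finite cyclic example. The following Python sketch is an illustration under stated assumptions: the group Z/60 and the subgroup S = {0, 30} are hypothetical choices, and the code relies on the standard fact that every endomorphism of a cyclic group Z/n is multiplication by some residue e. It lists the idempotent endomorphisms which leave every element of S invariant and picks the one with the smallest image.

```python
n = 60
S = [0, 30]

# Idempotent endomorphisms of Z/n: multiplications by e with e*e = e (mod n).
idempotents = [e for e in range(n) if (e * e - e) % n == 0]

# Keep those which leave every element of S invariant.
fixing_S = [e for e in idempotents if all((e * s) % n == s for s in S)]

# The image of each idempotent is a direct summand of Z/n containing S.
images = {e: sorted({(e * x) % n for x in range(n)}) for e in fixing_S}
smallest = min(images.values(), key=len)

print(fixing_S)
print(smallest)
```

In this example the smallest image is the cyclic subgroup of order 4, which is contained in every other direct summand that contains S.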

Suppose now that

$$G = H' \oplus H'' = K' \oplus K''$$

are direct decompositions of $G$ with the property that $H'$ and $K'$ are both
minimal extensions of $S$. This implies clearly


$$S \cap p^iH' = S \cap p^iG = S \cap p^iK' \quad \text{for every prime power } p^i;$$

and now we infer from Corollary 3, (a) the existence of an isomorphism $\eta'$
of $H'$ upon $K'$ which leaves invariant every element in $S$.

Now we infer from Lemma 2 the isomorphy of $H''$ and $K''$. Consequently
there exists an automorphism of $G$ which induces $\eta'$ in $H'$ and maps
$H''$ upon $K''$; and this automorphism leaves invariant every element in $S$.
Thus we have shown that the direct decompositions $G = H' \oplus H''$ and
$G = K' \oplus K''$ are essentially the same; and this completes the proof.


2. The existence of extensions with given invariants. The invariants 
of extensions which we consider are chains of subgroups of the following type. 


DEFINITION 1. The subgroups $S(i)$ of the group $S$ form a p-Loewy
chain [$p$ a prime], if $S(0) = S$ and $pS(i) \le S(i+1) \le S(i)$ for $0 \le i$.


If $G$ is an extension of $S$, then the intersections $S \cap p^iG$ form a p-Loewy
chain of $S$. It would easily be possible to define such chains also with
transfinite terms; but for our present purposes this is not necessary.

$S(i)/S(i+1)$ is a direct sum of cyclic groups of order $p$, since
$pS(i) \le S(i+1)$. This justifies the term Loewy chain. One verifies
easily that $p^jS(i) \le S(i+j)$ and that in particular $p^jS \le S(j)$.

We note finally that the subgroups $p^iS$ for $0 \le i$ form a p-Loewy chain.
We shall refer to this chain as to the trivial p-Loewy chain of $S$.
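
The defining inequalities of a p-Loewy chain are easy to test mechanically on a finite cyclic group. The Python sketch below is an illustration only: it assumes S = Z/8 and p = 2, represents subgroups as sets of residues, and checks a finite initial segment of a chain; the helper names are hypothetical.

```python
def subgroup(n, gens):
    """Subgroup of Z/n generated by gens, returned as a frozenset."""
    elems, frontier = {0}, [0]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x + g) % n
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def times(p, X, n):
    """The subgroup pX of Z/n."""
    return frozenset((p * x) % n for x in X)


def is_p_loewy_chain(p, chain, S, n):
    """Check S(0) = S and pS(i) <= S(i+1) <= S(i) along a finite list."""
    if chain[0] != S:
        return False
    return all(
        times(p, chain[i], n) <= chain[i + 1] <= chain[i]
        for i in range(len(chain) - 1)
    )


n, p = 8, 2
S = subgroup(n, [1])
trivial = [subgroup(n, [p**i]) for i in range(4)]            # S, 2S, 4S, 0
non_trivial = [S, S, subgroup(n, [2]), subgroup(n, [4]), subgroup(n, [0])]
not_a_chain = [S, subgroup(n, [4])]                          # 2S is not inside 4S

print(is_p_loewy_chain(p, trivial, S, n))
print(is_p_loewy_chain(p, non_trivial, S, n))
print(is_p_loewy_chain(p, not_a_chain, S, n))
```

The second list repeats S once before descending, so it is a p-Loewy chain different from the trivial one.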


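Lemma 1. If $G$ is an extension of the group $S$, and if $p$ is a prime
number, then

$$[p(S \cap p^iG) \cap p^{i+2}G]/p(S \cap p^{i+1}G) \cong K[p, (S \cap p^iG) + p^{i+1}G]/(K[p, S \cap p^iG] + K[p, p^{i+1}G]) \quad \text{for } 0 \le i.$$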


Proof. If we let $K = K[p, G]$, then $X \cap K = K[p, X]$ for every subgroup
$X$ of $G$. If we remember this, and if we let in Ore's formula [Ore (1),
p. 174, (17)] $A = K$, $B = S \cap p^iG$, $C = p^{i+1}G$, then we find that the group

$$U = K[p, (S \cap p^iG) + p^{i+1}G]/(K[p, S \cap p^iG] + K[p, p^{i+1}G])$$

is isomorphic to the group

$$V = ([K + (S \cap p^iG)] \cap [K + p^{i+1}G])/[K + (S \cap p^{i+1}G)]$$

where we use the fact that $(S \cap p^iG) \cap p^{i+1}G = S \cap p^{i+1}G$.

Consider now the endomorphism of $G$ which maps $x$ upon $px$. If $X$ is
a subgroup of $G$, then $K + X$ is the inverse image of $pX$ under this
endomorphism; and one sees easily that $(K + X) \cap (K + Y)$ is the inverse
image of $pX \cap pY$ [where $X$ and $Y$ are subgroups of $G$]. This implies the
isomorphy of the groups $V$ and

$$W = [p(S \cap p^iG) \cap p^{i+2}G]/p(S \cap p^{i+1}G).$$

Thus $U$ and $W$ are isomorphic groups, as we intended to show.
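
The orders of the two groups U and W of this proof can be compared by brute force in a small case. The following Python sketch is only a numerical check under assumptions that are not in the text: G is taken to be Z/2 x Z/8, p = 2, and S is the cyclic subgroup generated by (1, 2). All helper names are hypothetical.

```python
from itertools import product

MODS = (2, 8)   # G = Z/2 x Z/8 (assumed example)
P = 2
ZERO = tuple(0 for _ in MODS)
G = frozenset(product(*(range(m) for m in MODS)))


def add(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODS))


def mul(k, x):
    return tuple((k * a) % m for a, m in zip(x, MODS))


def generated(gens):
    """Subgroup of G generated by gens."""
    elems, frontier = {ZERO}, [ZERO]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def times(k, X):
    return frozenset(mul(k, x) for x in X)


def plus(X, Y):
    return frozenset(add(x, y) for x in X for y in Y)


def socle(X):
    """K[p, X]: the elements x of X with px = 0."""
    return frozenset(x for x in X if mul(P, x) == ZERO)


S = generated([(1, 2)])


def order_U(i):
    B = S & times(P**i, G)
    C = times(P**(i + 1), G)
    return len(socle(plus(B, C))) // len(plus(socle(B), socle(C)))


def order_W(i):
    numerator = times(P, S & times(P**i, G)) & times(P**(i + 2), G)
    denominator = times(P, S & times(P**(i + 1), G))
    return len(numerator) // len(denominator)


orders = [(order_U(i), order_W(i)) for i in range(4)]
print(orders)
```

Both groups are direct sums of cyclic groups of order p, so equal orders already give isomorphy in this finite case.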


COROLLARY 1. If $G$ is an extension of the group $S$, and if $p$ is a prime
number, then $[p(S \cap p^iG) \cap (S \cap p^{i+2}G)]/p(S \cap p^{i+1}G)$ is isomorphic to
a subgroup of $K[p, p^iG]/K[p, p^{i+1}G]$.


Proof. It is an immediate consequence of Lemma 1 that the group
$[p(S \cap p^iG) \cap p^{i+2}G]/p(S \cap p^{i+1}G)$ is isomorphic to a subgroup of
$K[p, p^iG]/(K[p, S \cap p^iG] + K[p, p^{i+1}G])$; and this group is a homomorphic
image of $K[p, p^iG]/K[p, p^{i+1}G]$. But every homomorphic image of
$K[p, p^iG]/K[p, p^{i+1}G]$ is isomorphic to a subgroup of $K[p, p^iG]/K[p, p^{i+1}G]$
so that the first of our groups is isomorphic to a subgroup of the last one.


EXISTENCE THEOREM. Suppose that there is given, for every prime $p$,
a p-Loewy chain $S(i; p)$ of the group $S$. Then there exists a little extension
$G$ of $S$ such that $S \cap p^iG = S(i; p)$ for every prime power $p^i$ if, and only
if, the following conditions are satisfied by these Loewy chains.


(N) If the p-Loewy chain $S(i; p)$ is non-trivial, then

(a) $S$ contains only a finite number of elements of order $p$;

(b) $[pS(i; p) \cap S(i+2; p)]/pS(i+1; p)$ is finite for every $i$; and

(c) $[pS(i; p) \cap S(i+2; p)]/pS(i+1; p) = 0$ for almost every $i$.


Remark. If the p-Loewy chain $S(i; p)$ is trivial, then conditions (b)
and (c) are automatically satisfied, whereas the validity of (a) is not required.


Proof of the necessity of Condition (N). Assume the existence of a
little extension $G$ of $S$ such that $S \cap p^iG = S(i; p)$ for every prime power
$p^i$. If $G/S$ does not contain elements of order $p$, then $G/S$ does not contain
elements of order $p^i$ and $S \cap p^iG = p^iS$. Thus the absence of elements of
order $p$ in $G/S$ implies the triviality of the p-Loewy chain $S(i; p)$. If the
p-Loewy chain $S(i; p)$ is non-trivial, then $G/S$ contains elements of order $p$.
Since $G$ is a little extension of $S$, this implies the finiteness of $K[p, G]$.
From the finiteness of $K[p, G]$ we infer the finiteness of $K[p, S]$, the
finiteness of $K[p, p^iG]/K[p, p^{i+1}G]$ for every $i$ and the vanishing of almost
every $K[p, p^iG]/K[p, p^{i+1}G]$. The necessity of Condition (N) is now an
immediate consequence of Corollary 1.

We precede the proof of the sufficiency of Condition (N) by the proof 
of a more general existential proposition. 

LEMMA 2. If $S(i)$ is a p-Loewy chain of the group $S$, then there exists
an extension $P$ of $S$ with the following properties:


(a) $S(i) = S \cap p^iP$ for every $i$;

(b) $K[p, P]/K[p, S]$ is isomorphic to the direct sum of the groups
$[S(i+2) \cap pS(i)]/pS(i+1)$;

(c) every element in $P/S$ is of order a power of $p$.


Proof. We show by complete induction with respect to $j$ the existence
of groups $E(j)$ with the following properties:


(1) $S = E(0)$;

(2) $E(j) \le E(j+1)$ for $0 \le j$;

(3) $p^jE(j) = S(j)$;

(4) $p^kE(j) \cap S = S(k)$ for $0 \le k \le j$;

(5) $K[p, E(1)] = K[p, S]$;

(6) $K[p, E(j+1)]/K[p, E(j)] \cong [S(j+1) \cap pS(j-1)]/pS(j)$
for $0 < j$.


We note that $S = E(0)$ meets all the relevant requirements. It will be
convenient, however, to give a separate construction of $E(1)$. We note first
that $pS = pS(0) \le S(1)$. Hence $S(1)/pS$ is a direct sum of cyclic groups
of order $p$; and consequently there exists some basis $B$ of $S(1)$ modulo $pS$.
We obtain $E(1)$ by adjoining to $S$ elements $b'$, for $b$ in $B$, subject only to
the relations: $pb' = b$.

Then $E(1) = S + \{B'\}$ [where $B'$ indicates the set of all the $b'$ for
$b$ in $B$]. Hence $pE(1) = pS + \{pB'\} = pS + \{B\} = S(1)$.

If $x$ is any element in $E(1)$, then $x = s + \sum c(b)b'$ where $s$ is in $S$ and
where almost every $c(b) = 0$ and where the summation ranges over all the
$b$ in $B$. Since $pb' = b$ belongs to $S$, we may assume without loss in generality
that $c(b) = 0$, if divisible by $p$. If $0 = px$, then $0 = px = ps + \sum c(b)b$.
Since $ps$ belongs to $pS$, and since $B$ is a basis of $S(1)$ modulo $pS$, it follows
that every $c(b)$ is divisible by $p$. Hence every $c(b) = 0$; and so $x = s$ belongs




to $S$. This completes the proof of the fact that $E(1)$ meets all the relevant
requirements.
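
As an illustration of this first step, consider the smallest non-trivial case. The example below is an assumption made for illustration and is not taken from the text: $S$ is cyclic of order $p$ and the chain repeats $S$ once before vanishing.

```latex
% Illustration only: S cyclic of order p with a non-trivial p-Loewy chain.
S = \langle s \rangle,\quad ps = 0,\qquad S(0) = S(1) = S,\quad S(i) = 0 \ \text{for } i \ge 2.
% Since pS = 0, the set B = \{s\} is a basis of S(1) modulo pS.
E(1) = \langle s, b' \mid ps = 0,\ pb' = s \rangle = \langle b' \rangle \cong \mathbb{Z}/p^2,
\qquad pE(1) = \langle s \rangle = S(1),
\qquad K[p, E(1)] = \langle pb' \rangle = \langle s \rangle = K[p, S].
```

Thus adjoining a single element $b'$ with $pb' = s$ already yields a group which satisfies (3) and (5) for this chain.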


Assume now that $0 < i$ and that we have constructed, for $0 \le j \le i$,
groups $E(j)$ which satisfy conditions (1) to (6). Since the $S(k)$ form a
p-Loewy chain, it follows from (3) that

$$pS(i) \le S(i+1) \cap pS(i-1) \le S(i+1) \le S(i) = p^iE(i).$$

This implies in particular that $[S(i+1) \cap pS(i-1)]/pS(i)$ and
$S(i+1)/[S(i+1) \cap pS(i-1)]$ are direct sums of cyclic groups of
order $p$. Consequently there exists a basis $A$ of $S(i+1)$ modulo
$S(i+1) \cap pS(i-1)$ and a basis $B$ of $S(i+1) \cap pS(i-1)$ modulo $pS(i)$. Since
$A$ is part of $p^iE(i)$, there exists to every element $a$ in $A$ an element $a'$ in
$E(i)$ such that $p^ia' = a$.

We obtain $E(i+1)$ by adjoining to $E(i)$ elements $a''$, for $a$ in $A$, and
elements $b'$, for $b$ in $B$, subject solely to the relations:

$$pa'' = a' \text{ for } a \text{ in } A, \qquad p^{i+1}b' = b \text{ for } b \text{ in } B.$$

We note first that $E(i+1) = E(i) + \{A''\} + \{B'\}$ and that consequently

$$p^{i+1}E(i+1) = pS(i) + \{A\} + \{B\} = [S(i+1) \cap pS(i-1)] + \{A\} = S(i+1)$$

since $B$ is a basis of $S(i+1) \cap pS(i-1)$ modulo $pS(i)$, $A$ a basis of
$S(i+1)$ modulo $S(i+1) \cap pS(i-1)$, and since $S(i) = p^iE(i)$ by (3).
Every element $x$ in $E(i+1)$ may be represented in the form:

$$\text{(r)} \qquad x = u + \sum_a c(a)a'' + \sum_b c(b)b'$$

where $u$ belongs to $E(i)$, where the first of these sums ranges over all the $a$
in $A$ and the second sum over all the $b$ in $B$, and where almost every $c(a)$
and almost every $c(b)$ is 0. Since $pa'' = a'$ belongs to $E(i)$, we may assume
without loss in generality that $c(a) = 0$ whenever $c(a)$ is divisible by $p$.
Since $p^{i+1}b' = b$ belongs to $S \le E(i)$, we may assume without loss in
generality that $c(b) = 0$ whenever $c(b)$ is divisible by $p^{i+1}$. We shall refer
to representations (r) of $x$ which meet all these requirements as to normalized
representations.

Suppose now that the element $x$ in $E(i+1)$ is given in the normalized
form (r), that $0 < j \le i$ and that $p^jx$ belongs to $S$. Then

$$p^jx = p^ju + p^{j-1}\sum_a c(a)a' + \sum_b p^jc(b)b'.$$



Since $p^jx$ is in $S$, it is in $E(i)$. Since $u$ is in $E(i)$, so is $p^ju$. Furthermore
every $a'$ is in $E(i)$. Consequently $\sum_b p^jc(b)b'$ belongs to $E(i)$. It follows
from the construction of $E(i+1)$ that every $p^jc(b)b'$ belongs to $E(i)$ and
that therefore $p^jc(b)$ is divisible by $p^{i+1}$. Hence $c(b) = p^{i+1-j}d(b)$ where
$d(b) = 0$ whenever $d(b)$ is divisible by $p^j$. Consequently

$$p^jx = p^ju + p^{j-1}\sum_a c(a)a' + \sum_b d(b)b.$$

Since $p^jx$ as well as all the $b$ belong to $S$, and since $u$ and all the $a'$ are in
$E(i)$, it follows that $p^{j-1}[pu + \sum_a c(a)a']$ belongs to $S \cap p^{j-1}E(i) = S(j-1)$
[by (4)]. Multiplying this element by $p^{i+1-j}$ we infer that $p^{i+1}u + \sum_a c(a)a$
belongs to $p^{i+1-j}S(j-1) \le pS(i-1)$, since the $S(k)$ form a p-Loewy
chain. But $p^{i+1}u$ belongs to $p^{i+1}E(i) = pS(i) \le pS(i-1)$ by (3); and
thus we have shown that $\sum_a c(a)a$ belongs to $pS(i-1)$. Since $A$ is part of
$S(i+1)$, $\sum_a c(a)a$ belongs to $S(i+1) \cap pS(i-1)$; and since $A$ forms a
basis of $S(i+1)$ modulo $S(i+1) \cap pS(i-1)$, it follows that every $c(a)$
is divisible by $p$. This implies $c(a) = 0$ because of our normalization of (r).
Hence

$$p^jx = p^ju + \sum_b d(b)b.$$

Since $p^jx$ and every $b$ are in $S$, it follows that $p^ju$ belongs to $S \cap p^jE(i) = S(j)$
by (4). Since every $b$ belongs to $S(i+1) \le S(j)$ [because of $j < i+1$],
we have shown that $p^jx$ belongs to $S(j)$. Now one deduces from (4) easily
the inequalities:

$$S(j) = p^jE(i) \cap S \le p^jE(i+1) \cap S \le S(j)$$

and this shows the validity of

$$S(j) = p^jE(i+1) \cap S \quad \text{for } 0 < j < i+1.$$

If $b$ belongs to $B$, then $b$ belongs to $S(i) = p^iE(i)$. Hence there exist
elements $b''$ in $E(i)$ such that $p^ib'' = b$. From $p^{i+1}b' = b$ we infer that
$p[p^ib' - p^{i-1}b''] = 0$; and thus the elements

$$b^* = p^ib' - p^{i-1}b'' \text{ for } b \text{ in } B \text{ belong to } K[p, E(i+1)].$$


Suppose now that $f(b)$, for $b$ in $B$, are integers such that almost every
$f(b) = 0$ and $\sum_b f(b)b^*$ belongs to $K[p, E(i)]$. Since every $b''$ belongs to
$E(i)$, this implies that $\sum_b f(b)p^ib'$ belongs to $E(i)$. It follows from the
construction of $E(i+1)$ that every $f(b)p^ib'$ belongs to $E(i)$ and that
therefore $f(b)p^i$ is divisible by $p^{i+1}$ and $f(b)$ is divisible by $p$. Since $pb^* = 0$,
this implies $\sum_b f(b)b^* = 0$; and we have shown that


the elements $b^*$ for $b$ in $B$ are independent modulo $K[p, E(i)]$.


Consider now an element $x$ in $K[p, E(i+1)]$; and suppose that it is
given in the normalized form (r). Then

$$0 = px = pu + \sum_a c(a)a' + \sum_b pc(b)b'.$$

Since $pu$ and the elements $a'$ are in $E(i)$, it follows that $\sum_b pc(b)b'$ belongs
to $E(i)$; and we infer as usual that every $pc(b)$ is divisible by $p^{i+1}$. Hence
$c(b) = p^ig(b)$ where $g(b) = 0$ if divisible by $p$. Hence

$$0 = pu + \sum_a c(a)a' + \sum_b g(b)b;$$

and multiplying by $p^i$ we find that

$$0 = p^{i+1}u + \sum_a c(a)a + \sum_b g(b)p^ib.$$

The element $p^{i+1}u$ belongs to $p^{i+1}E(i) = pS(i)$ by (3); and the element $p^ib$
belongs to $p^i[S(i+1) \cap pS(i-1)] \le p[S(i+1) \cap pS(i-1)] \le pS(i)$
since $0 < i$. Consequently $\sum_a c(a)a$ belongs to $pS(i)$ too. But $A$ is a basis
of $S(i+1)$ modulo $S(i+1) \cap pS(i-1)$; and this implies, a fortiori, that
$A$ is independent modulo $pS(i)$. Hence every $c(a) = 0$; and now it follows
that

$$x = u + \sum_b c(b)b' = u + \sum_b g(b)p^ib' = u + \sum_b g(b)p^{i-1}b'' + \sum_b g(b)b^*.$$

Since $u$ and $b''$ belong to $E(i)$, so does $v = u + \sum_b g(b)p^{i-1}b''$. Since $x$ and
$b^*$ belong to $K[p, E(i+1)]$, we have $pv = 0$. Hence $v$ belongs to
$K[p, E(i)]$. This completes the proof of the fact that


the elements $b^*$ for $b$ in $B$ form a basis of $K[p, E(i+1)]$ modulo
$K[p, E(i)]$.


Thus we have shown that $K[p, E(i+1)]/K[p, E(i)]$ and
$[S(i+1) \cap pS(i-1)]/pS(i)$ possess bases containing the same number of elements.
This implies the isomorphy of these groups, since they are both direct sums
of cyclic groups of order $p$.




This completes the proof of the fact that the group $E(i+1)$ satisfies
all the relevant conditions (1) to (6). Thus we have completed the inductive
construction of the groups $E(j)$.

Since the groups $E(j)$ form an ascending chain of groups, there exists
one and only one group $P$ which is the sum of the groups $E(i)$. From (1)
it follows that $S = E(0)$ is part of $P$. If $x$ is an element in $P$ such that $p^ix$
belongs to $S$, then $x$ belongs to some $E(k)$. If $i < k$, then we infer from (4)
that $p^ix$ belongs to $p^iE(k) \cap S = S(i)$; and if $k \le i$, then we infer from (3)
that $p^ix$ belongs to $S \cap p^{i-k}p^kE(k) = S \cap p^{i-k}S(k) \le S(i)$, since
the $S(j)$ form a p-Loewy chain. Consequently it follows from (3) that

$$S(i) = p^iE(i) \le p^iP \cap S \le S(i) \quad \text{or} \quad S(i) = p^iP \cap S.$$

Noting that $K[p, P]$ is the sum of the $K[p, E(i)]$ one sees easily that
$K[p, P]/K[p, E(1)]$ is isomorphic to the direct sum of the
$K[p, E(i+1)]/K[p, E(i)]$ for $0 < i$; and hence it follows from (5) and (6) that
$K[p, P]/K[p, S]$ is isomorphic to the direct sum of the
$[S(i+1) \cap pS(i-1)]/pS(i)$ for $0 < i$. If finally
$x$ is an element in $P$, then $x$ belongs to some $E(n)$; and it follows from (3)
that $p^nx$ belongs to $p^nE(n) = S(n) \le S$. Hence every element in $P$ is
modulo $S$ of order a power of $p$. Thus $P$ meets all the requirements; and
this completes the proof of Lemma 2.


Proof of the sufficiency of Condition (N). Assume that the p-Loewy
chains $S(i; p)$ satisfy Condition (N). If $S(i; p)$, for some prime $p$, is a
trivial p-Loewy chain of $S$, then we let $P_p = S$. If the p-Loewy chain $S(i; p)$
is, for some prime $p$, not trivial, then we denote by $P_p$ some extension of $S$
with the properties:


(d) $S(i; p) = S \cap p^iP_p$ for every $i$;

(e) $K[p, P_p]/K[p, S]$ is isomorphic to the direct sum of the groups
$[S(i+2; p) \cap pS(i; p)]/pS(i+1; p)$;

(f) every element in $P_p/S$ is of order a power of $p$.

It follows from (e) and Conditions (N.a) to (N.c) that

(g) $K[p, P_p]$ is finite.

The existence of $P_p$ is a consequence of Lemma 2.


There exists one and essentially only one extension $G$ of $S$ which contains
all these groups $P_p$ as subgroups and which is their sum. One deduces from
(f) and (g) that $G$ is a little extension of $S$; and one deduces from (d) and



(f) that $S(i; p) = S \cap p^iG$ for every prime power $p^i$. Thus $G$ meets all
our requirements; and this completes the proof of the Existence Theorem.
We offer two important special cases of the Existence Theorem.


A. Suppose that $S$ contains only a finite number of elements of any
given order; and that $S(i; p)$ is, for every prime number $p$, a p-Loewy chain
of $S$. Then there exists a little extension $G$ of $S$ such that $S \cap p^iG = S(i; p)$
for every prime power $p^i$.


To deduce this from the Existence Theorem one has to note mainly 
that S does not contain elements of order 0 and that the descending chain 
condition is satisfied by the subgroups of every given primary component 
of S. We omit the further details of the derivation. 


B. Suppose that $S$ is a finite group; and that $S(i; p)$ is, for every
prime number $p$, a p-Loewy chain of $S$. Then there exists a finite extension
$G$ of $S$ such that $S \cap p^iG = S(i; p)$ for every prime power $p^i$ if, and only if,


(B') there exists to every prime $p$ an integer $m = m(p)$ such that
$S(m(p); p) = 0$ and


(B'') there exists only a finite number of primes $p$ with non-trivial
p-Loewy chain $S(i; p)$.


We omit the proof. 


3. The partially ordered set of extension types. The following gen- 
eralization of the concept of equivalence is used for the definition of this 
structure. 


DEFINITION 1. If $G$ and $H$ are extensions of the group $S$, and if there
exists a homomorphism of $G$ into $H$ which leaves invariant every element in $S$,
then Ext [S < G] $\le$ Ext [S < H]. If both Ext [S < H] $\le$ Ext [S < G]
and Ext [S < G] $\le$ Ext [S < H], then Ext [S < H] = Ext [S < G].


We express this in words by saying that the extension type of $G$ over $S$
is smaller than or equal to the extension type of $H$ over $S$. It is clear now
that the extension types over $S$ form a partially ordered set. We shall restrict
ourselves to a consideration of the little extensions of $S$; and their extension
types form a partially ordered set too which we denote by A(S), the partially
ordered set of the types of little extensions over $S$.

To obtain an internal characterization of A(S) we define another partially
ordered set. If $L(p)$ is, for every prime number $p$, a definite p-Loewy
chain of the group $S$, then we term

$$L = [\cdots L(p) \cdots]$$

a Loewy system of $S$ whose p-th coordinate is the p-Loewy chain $L(p)$. If
the p-Loewy chain $L(p)$ consists of the subgroups $S(i; p)$ of $S$ and the
p-Loewy chain $L'(p)$ consists of the subgroups $S'(i; p)$ of $S$, and if
$S(i; p) \le S'(i; p)$, then we let $L(p) \le L'(p)$. If $L = [\cdots L(p) \cdots]$ and
$L' = [\cdots L'(p) \cdots]$ are Loewy systems of $S$ such that $L(p) \le L'(p)$ for
every prime number $p$, then we let $L \le L'$. It is clear that in this fashion
the system of all the Loewy systems of $S$ has been made into a partially
ordered set. This partially ordered set is, in general, somewhat too big.
Hence we define.


DEFINITION 2. The Loewy system $L = [\cdots L(p) \cdots]$ is a little
Loewy system, if it satisfies the following condition:


(N) If the p-Loewy chain $L(p)$ consisting of the subgroups $L(i; p)$
of $S$ is non-trivial, then


(a) $S$ contains only a finite number of elements of order $p$;

(b) $[pL(i; p) \cap L(i+2; p)]/pL(i+1; p)$ is finite for every $i$; and

(c) $[pL(i; p) \cap L(i+2; p)]/pL(i+1; p) = 0$ for almost every $i$.

The partially ordered set of the little Loewy systems in $S$ we denote
by L(S).


Now we may map the little extension $G$ of $S$ upon the Loewy system
whose p-th coordinate is the p-Loewy chain $S \cap p^iG$. It is a consequence
of the Existence Theorem of Section 2 that this Loewy system is a little Loewy
system; and one deduces readily from 1, Proposition 1 and the Existence
Theorem of Section 2 that this mapping induces a [natural] isomorphism
of the partially ordered set A(S) upon the partially ordered set L(S). We
restate this result as follows.


CHARACTERIZATION THEOREM. The partially ordered sets A(S) and
L(S) are essentially the same.


The division into extension types and the mapping of A(S) upon L(S) 
may be brought into sharper focus by the following results. 


THEOREM 1. The following properties of the little extensions G and H 
of S are equivalent. 


(i) Ext [S < G] = Ext [S < H].


(ii) $S \cap p^iG = S \cap p^iH$ for every prime power $p^i$.


(iii) If $G = G' \oplus G''$ and $H = H' \oplus H''$ are direct decompositions such
that $G'$ and $H'$ are minimal extensions of $S$, then $G'$ and $H'$ are equivalent
extensions of $S$.


(iv) There exist direct decompositions $G = A \oplus B$ and $H = C \oplus D$
such that $A$ and $C$ are equivalent extensions of $S$.


This is an almost immediate consequence of 1, Proposition 2 and 1,
Proposition 4 and 1, Corollary 3.

Combining Theorem 1 and the Equivalence Criterion of Section 1 we find
that the little extensions $G$ and $H$ of $S$ belong to the same equivalence class
if, and only if, $G \cong H$ and Ext [S < G] = Ext [S < H]. Thus the
equivalence classes are determined within the extension type by the structure of
the extended group.


THEOREM 2. The following properties of the little extensions $G$ and $H$
of $S$ are equivalent.
(i) Ext [S < G] $\le$ Ext [S < H].
(ii) $S \cap p^iG \le S \cap p^iH$ for every prime power $p^i$.
(iii) There exists a little extension $J$ of $G$ [and of $S$] such that
Ext [S < J] = Ext [S < H].


Proof. The equivalence of conditions (i) and (ii) is an immediate
consequence of 1, Proposition 1. If (iii) is true, then there exists a
homomorphism $\eta$ of $J$ into $H$ which leaves invariant every element in $S$; and
hence it follows from $G \le J$ that

$$S \cap p^iG \le S \cap p^iJ = (S \cap p^iJ)\eta \le S \cap p^iH \quad \text{for every prime power } p^i;$$

and thus (ii) is a consequence of (iii).
Assume finally the validity of (ii). It is a consequence of 1, Proposition
4 that $H = H' \oplus H''$ where $H'$ is a minimal extension of $S$. Clearly
Ext [S < H] = Ext [S < H']. Since we intend to show the existence of an
extension of $G$ which belongs to the same extension type of $S$ as $H$, we may
assume now without loss in generality that $H$ itself is a minimal extension
of $S$.

We form now a group $K$ with the following properties:

(a) $K = G + H$ and $G \cap H = S$.



One deduces from (a) readily the validity of the following properties.
(a') Every element in $K$ has the form $g + h$ with $g$ in $G$ and $h$ in $H$.


(a'') $g + h$ [with $g$ in $G$ and $h$ in $H$] belongs to $S$ if, and only if,
both $g$ and $h$ belong to $S$.


Next we show
(b) $S \cap p^iK = S \cap p^iH$ for every prime power $p^i$.


It is clear that $S \cap p^iH \le S \cap p^iK$. If conversely $x$ is an element in
$S \cap p^iK$, then there exist elements $g$ and $h$ in $G$ and $H$ respectively such
that $x = p^i(g + h) = p^ig + p^ih$. Since $p^ig$ is in $G$, $p^ih$ in $H$ and $x$ in $S$,
we infer from (a'') that $p^ig$ and $p^ih$ both belong to $S$. Hence $p^ih$ belongs
to $S \cap p^iH$ and it follows from (ii) that $p^ig$ belongs to
$S \cap p^iG \le S \cap p^iH$.
Consequently $x$ is in $S \cap p^iH$; and this completes the proof of (b).


(c) There exists a homomorphism $\kappa$ of $K$ into $H$ which leaves invariant
every element in $S$.


Since $K/S = (G/S) \oplus (H/S)$, and since none of the elements in $G/S$
and $H/S$ is of order 0, none of the elements in $K/S$ is of order 0. If $K/S$
contains an element of prime number order $p$, then $G/S$ or $H/S$ contains an
element of order $p$. If $H/S$ contains an element of order $p$, then the little
extension $H$ of $S$ contains only a finite number of elements of order $p$. If
$G/S$ contains an element of order $p$, then the little extension $G$ of $S$ contains
only a finite number of elements of order $p$. Hence $S$ likewise contains only
a finite number of elements of order $p$. This implies that $H$ contains only a
finite number of elements of order $p$, since otherwise $S$ would contain all the
infinitely many elements of order $p$ in $H$. Thus we have shown that $H$
contains only a finite number of elements of order $p$ whenever $K/S$ contains
elements of order $p$. Consequently we may apply 1, Proposition 1; and we
infer from (b) the existence of a homomorphism $\kappa$ of $K$ into $H$ which leaves
invariant every element in $S$.


(d) $H$ is a direct summand of $K$.


By (c) there exists a homomorphism $\kappa$ of $K$ into $H$ which leaves
invariant every element in $S$. Since $H \le K$, $\kappa$ is an endomorphism of $K$
and consequently $\kappa$ induces an endomorphism of $H$ which leaves invariant
every element in $S$. Since $H$ is a minimal extension of $S$, it follows from 1,
Corollary 3, (b) that $\kappa$ induces an automorphism in $H$. If $W$ is the kernel
of the endomorphism $\kappa$ of $K$, then it is clear that $W \cap H = 0$, as $\kappa$ induces
an automorphism in $H$. If $x$ is an element in $K$, then $x\kappa$ belongs to $H$.
But $H = H\kappa$; and consequently there exists an element $y$ in $H$ such that
$y\kappa = x\kappa$. Clearly $x - y$ belongs to $W$; and thus we have shown that
$K = H \oplus W$; and this proves (d).

Let $K = H \oplus N$ by (d). Then $N \cong K/H = (G + H)/H \cong G/(G \cap H)
= G/S$. Thus $N$ and $K$ contain an infinity of elements of order $p$ whenever
$G/S$ contains an infinity of elements of order $p$; and hence $K$ will, in general,
not be a little extension of $S$. Thus $K$ will not be the desired extension $J$
in spite of the fact that $K$ and $H$ belong by (c) to the same extension type
of $S$.

There exists a maximal subgroup $M$ of $N$ such that $M \cap G = 0$. This
follows from the customary application of set theoretical principles
[Maximum Principle]. We prove next


(e) Every element in $N/M$ is of order different from 0, and to every
prime $p$ there exists only a finite number of elements of order $p$ in $N/M$.


Since $N$ and $G/S$ are isomorphic, neither $N$ nor $N/M$ can contain
elements of order 0. Suppose now that $p$ is a prime number such that $N/M$
contains an infinity of elements of order $p$. Then $N$ contains elements of
order $p$. Hence $G/S$ contains elements of order $p$. Hence the little extension
$G$ of $S$ contains only a finite number of elements of order $p$. Consequently
$K[p, G]$ and $(M + K[p, G])/M$ are finite groups. Since therefore
$[N \cap (M + K[p, G])]/M$ is a finite subgroup of the group $K[p, N/M]$
which is by hypothesis infinite, there exists an element $M + w$ in $K[p, N/M]$
which does not belong to $[N \cap (M + K[p, G])]/M$. Form $M' = M + \{w\}$.
Then $M < M' \le N$. It follows from the maximality of $M$ that $M' \cap G \ne 0$.
Consequently there exists an element $m$ in $M$ and an integer $i$ such that
$0 \ne m + iw$ belongs to $G$. Since $M + w$ is in $K[p, N/M]$, $pw$ is in $M$.
Since $M \cap G = 0$, it follows that $i$ is prime to $p$. Since $pw$ is in $M$, so is
$pm + ipw$. But this element belongs to $G$ too and $M \cap G = 0$. Hence
$p(m + iw) = 0$; and $m + iw$ belongs to $K[p, G]$. Consequently $M + w$ is
an element in $[N \cap (M + K[p, G])]/M$. This contradicts our choice of
$M + w$; and thus we have been led to a contradiction which proves that there
exists only a finite number of elements of order $p$ in $N/M$.

We let J = K/M. From M ∩ G = 0 it follows that G and (G + M)/M
are essentially the same; and thus we may consider J as an extension of G.
Since K/G ≅ H/S does not contain elements of order 0, neither does
J/G = K/(G + M). If J/G contains elements of order p, so does K/G ≅ H/S.
Hence H contains only a finite number of elements of order p. But




EXTENSION TYPES OF ABELIAN GROUPS. 485 


J = (H ⊕ N)/M = H ⊕ (N/M); and it follows from (e) that J contains
only a finite number of elements of order p. Thus we have shown that J is
a little extension of G [where we have identified as usual (G + M)/M and G].
From J ≅ H ⊕ (N/M) we infer that Ext [S < J] = Ext [S < H]; and
thus we have shown that J meets all the requirements. Hence (iii) is a
consequence of (ii); and this completes the proof.


Remark 1. If G is a group between S and the extension H of S, then
clearly Ext [S < G] ≤ Ext [S < H]. The question arises whether one may
conversely infer from Ext [S < G] ≤ Ext [S < H] the existence of a group
B between S and H such that Ext [S < G] = Ext [S < B]. The following
example shows that this is not always true. Let G = {u} ⊕ {v} where u
and v are elements of order p and p³ respectively; and let S = {u + pv}.
Then G is readily seen to be a minimal extension of the cyclic group S of
order p². Let H be a cyclic group of order p³ which contains S as a subgroup.
Then H = {w} where pw = u + pv. There exists one and only one
homomorphism η of G into H such that uη = p²w, vη = (1 − p)w. Then
(u + pv)η = pw = u + pv so that η leaves invariant every element in S.
Hence Ext [S < G] < Ext [S < H], since equality of these extension types
is clearly impossible. But G and H are both minimal extensions of S; and
thus it is impossible [by 1, Corollary 3] to find a group between S and H
whose extension type is that of G.
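
The arithmetic behind the example of Remark 1 can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it writes G = {u} ⊕ {v} as pairs (a, b) standing for au + bv, writes H = {w} as integers modulo p³, and assumes that the homomorphism sends u to p²w and v to (1 − p)w. The function name check_remark_1 is hypothetical.

```python
def check_remark_1(p):
    """Check the homomorphism of Remark 1 for one prime p.

    G = Z_p (+) Z_{p^3} is written as pairs (a, b) meaning a*u + b*v,
    and H = Z_{p^3} is written as integers c meaning c*w.
    """
    n = p ** 3

    def eta(a, b):
        # Assumed images: u -> p^2 * w and v -> (1 - p) * w.
        return (a * p * p + b * (1 - p)) % n

    # eta is well defined: it sends p*u and p^3*v to zero.
    well_defined = eta(p, 0) == 0 and eta(0, n) == 0
    # The generator u + p*v of S goes to p*w, which is how S sits inside H.
    fixes_generator = eta(1, p) == p % n
    # eta is additive on all of G.
    additive = all(
        eta((a1 + a2) % p, (b1 + b2) % n) == (eta(a1, b1) + eta(a2, b2)) % n
        for a1 in range(p) for a2 in range(p)
        for b1 in range(n) for b2 in range(n)
    )
    return well_defined and fixes_generator and additive
```

Calling check_remark_1 for a few small primes confirms only that such a map exists and fixes the generator of S; it does not verify the statements about extension types.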


Remark 2. The system of extension types of a little extension. 


Suppose that the group T is a little extension of the group S. Then
Ext [S < T] is a well determined element in A(S); and we denote by
A(S; T) the system of all the extension types Φ in A(S) such that
Ext [S < T] ≤ Φ.

If G is a little extension of T, then G is likewise a little extension of S
which satisfies Ext [S < T] ≤ Ext [S < G], so that Ext [S < G] belongs to
A(S; T). If G and H are little extensions of T such that Ext [T < G]
= Ext [T < H], then one verifies easily the equality Ext [S < G]
= Ext [S < H]. Consequently we obtain a single valued and order
preserving mapping of A(T) into A(S; T), if we map Ext [T < G] upon
Ext [S < G]; and it follows from Theorem 2 that in this fashion A(T)
is mapped upon the whole system A(S; T).

Simple examples show that this mapping will in general not be a one 
to one correspondence. 


Remark 3. The existence of common little extensions. 


486 REINHOLD BAER. 


The system A(S) contains one and only one maximal element; it is
represented by a little extension G of S which satisfies G = pG for every
prime p such that S contains only a finite number of elements of order p
[or equivalent: it is characterized by a little Loewy system S(i; p) = p^i S in
case S contains an infinity of elements of order p and S(i; p) = S in case S
contains only a finite number of elements of order p].

Suppose now that G and H are little extensions of S. From the
preceding remark we infer the existence of extension types Φ in A(S) such
that Ext [S < G] ≤ Φ and Ext [S < H] ≤ Φ. We infer from Theorem 2
the existence of little extensions J and L of S with the following properties:
Ext[S We infer from 
Theorem 1 the existence of direct decompositions J = J' ⊕ J'' and
L = L' ⊕ L'' such that J' and L' are equivalent extensions of S. This implies
in particular Ext [S < J'] = Ext [S < L'] = Φ. Consequently there exists a
group Z = U ⊕ V ⊕ W where U and J' [and L'] are equivalent extensions of S,
V = J'', W = L''. It is clear that Z is a little extension of S, since J and L
are little extensions of S. Furthermore it is easy to find a subgroup G_0 of
U ⊕ V such that G and G_0 are equivalent extensions of S; and a subgroup
H_0 of U ⊕ W such that H and H_0 are equivalent extensions of S. Let
Z_0 = G_0 + H_0. This is a little extension of S such that Ext [S < Z_0] ≤ Φ;
and Z_0 is essentially a sum of the extensions G and H of S.

In general it will not be possible just to form G + H with the
requirement S = G ∩ H, since this group need not be a little extension of S.

It should furthermore be noted that the preceding remarks do not settle 
the [still open] question whether there exists always in A(S) a least upper 
bound of any two extension types. 


4. Extensions of little groups. So far we have not imposed any
restrictions upon the groups S whose extensions were investigated. All the 
restrictions were imposed upon the extensions themselves. In order that the 
following extensions of our theory may be attained as simply as possible we 
restrict now the class of extended groups. 


DEFINITION 1. The group S is a little group, if every Loewy system 
of S is a little Loewy system of S. 


We mention first that free abelian groups of positive rank, in particular
the infinite cyclic groups, are not little groups. If S is such a free abelian
group, not 0, then let, for instance, S = S(0, p), S(3i − 2, p) = S(3i − 1, p)
= S(3i, p) = p^i S for 0 < i and every prime number p. It is easy to see
that these groups S(j, p) form a Loewy system of S, but not a little one;
and hence S is not a little group.




A large class of little groups, though not the largest one, has been 
mentioned in Section 2 [Criterion A]. In particular every finite group is 
a little group. 


Example. Let S be the cyclic group of order p, generated by the
element s. Denote by G the group obtained by adjoining to S elements g(i)
for i = 1, 2, ..., subject to the relations p^i g(i) = s for every i. It is easy
to see that

(L) S = S ∩ q^j G for every prime power q^j.


The minimal extension of the little group S which is characterized by the
Loewy system (L) is a group of type p^∞ which contains S as its uniquely
determined subgroup of order p. But there does not exist a homomorphism
of a group of type p^∞ into G which leaves invariant the elements in S.
Consequently there does not exist any little extension of the little group S
which belongs to the same extension type as G.

Neither subgroups nor quotient groups of little groups need be little. 
For instance, the additive group of rational numbers is a little group; but 
its subgroup of integers is not a little group, as has been pointed out before. 
If furthermore S is the direct sum of a cyclic group Z of order p and of a
countable infinity of groups R(i) each of which is isomorphic to the additive
group of rational numbers, then S is a little group. If J(i) is the subgroup
of the integers in R(i), and if J is the direct sum of the subgroups J(i),
then S/J contains an infinity of elements of order p; and now it is easily
verified that the quotient group S/J of the little group S is not a little group.


Proposition 1. If S is a little group, then A(S) is a complete modular
lattice. 


To prove this, it suffices to realize that A(S) is, by the Characterization
Theorem of Section 3, essentially the same as the partially ordered set L(S)
of the little Loewy systems of S; and that L(S) is the system of all the
Loewy systems of S, since S is a little group. The remainder of the
argument is fairly obvious.

If S is a little group, then A(S) need not satisfy any of the chain
conditions. If S happens to be finite, then the descending chain condition 
holds in A(S), though the ascending chain condition will, in general, not 
be satisfied by A(S). 

Proposition 2. Suppose that the subgroup R of the little group S has
the property


(F) If S contains only a finite number of elements of order p, then S/R
contains only a finite number of elements of order p.




Then S/R is a little group; and there exists a single valued and order 
preserving mapping of A(S) upon A(S/R) which induces an isomorphism in 
a suitable subsystem of A(S). 


Proof. Suppose that T(i, p) is a Loewy system of S/R. Then there
exists to every prime power p^i a uniquely determined subgroup S(i, p) of S
which contains R and satisfies T(i, p) = S(i, p)/R. Then pT(i, p)
= [pS(i, p) + R]/R; and now it is easy to see that the subgroups S(i, p)
form a Loewy system of S. Since S is a little group, the S(i, p) form a little
Loewy system. Assume now that, for some prime p, the p-Loewy chain
T(i, p) is non-trivial. Then the p-Loewy chain S(i, p) cannot be trivial
either. Since the S(i, p) form a little Loewy system, this implies that S
contains only a finite number of elements of order p; and it follows from
Condition (F) that S/R contains only a finite number of elements of order p.
Next we infer from the Isomorphism Laws and from Dedekind’s Law that 


[T(i + 2, p) ∩ pT(i, p)]/pT(i + 1, p)
  = ([S(i + 2, p)/R] ∩ [(pS(i, p) + R)/R])/[(pS(i + 1, p) + R)/R]
  ≅ [S(i + 2, p) ∩ (pS(i, p) + R)]/[pS(i + 1, p) + R]
  = (R + [S(i + 2, p) ∩ pS(i, p)])/[R + pS(i + 1, p)]
  ≅ [S(i + 2, p) ∩ pS(i, p)]/[S(i + 2, p) ∩ pS(i, p) ∩ (R + pS(i + 1, p))]
  = [S(i + 2, p) ∩ pS(i, p)]/[pS(i + 1, p) + (S(i + 2, p) ∩ pS(i, p) ∩ R)];


and this last group is clearly a homomorphic image of 


[S(i + 2, p) ∩ pS(i, p)]/pS(i + 1, p).
Since the S(i, p) form a little Loewy system, it follows now that, for every
fixed prime p, [T(i + 2, p) ∩ pT(i, p)]/pT(i + 1, p) is finite for every i
and 0 for almost every i; and thus we have shown that the T(i, p) form a
little Loewy system. Hence S/R is a little group.
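
The chain of isomorphisms above uses Dedekind's Law in the form A ∩ (B + R) = B + (A ∩ R) for subgroups with B ≤ A. As an illustration only, the following Python sketch checks this identity by brute force over all subgroups of one small abelian group. The choice of Z_2 ⊕ Z_4 and all function names are assumptions made for the example; nothing here is taken from the paper beyond the law itself.

```python
from itertools import product

MODULI = (2, 4)  # the group Z_2 (+) Z_4, chosen only as a small test case


def add(x, y):
    """Componentwise addition in the group."""
    return tuple((a + b) % m for a, b, m in zip(x, y, MODULI))


def generated(gens):
    """Subgroup generated by gens, as a frozenset of elements."""
    zero = tuple(0 for _ in MODULI)
    elems = {zero}
    frontier = [zero]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def all_subgroups():
    """All subgroups; each subgroup of this group needs at most two generators."""
    elements = list(product(*(range(m) for m in MODULI)))
    return {generated((x, y)) for x in elements for y in elements}


def subgroup_sum(a, b):
    """The subgroup a + b."""
    return frozenset(add(x, y) for x in a for y in b)


def dedekind_law_holds():
    """Check A ∩ (B + R) == B + (A ∩ R) whenever B is contained in A."""
    subs = all_subgroups()
    return all(
        a & subgroup_sum(b, r) == subgroup_sum(b, a & r)
        for a in subs for b in subs for r in subs
        if b <= a
    )
```

A finite check of this kind proves nothing in general, but it makes the two applications of the law in the display easy to follow: each time a subgroup contained in the left-hand term is pulled out of the intersection.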

It follows from the Characterization Theorem of Section 3 that A(S) 
is essentially the same as L(S) and that A(S/R) is essentially the same as 
L(S/R). Since S and S/R are little groups, L(S) and L(S/R) are the
systems of all Loewy systems in S and S/R respectively.

Denote now by L_R(S) the subset of all the Loewy systems S(i, p) in S
which satisfy: R ≤ S(i, p) for every prime power p^i.

If X(i, p) is some Loewy system in S, then X*(i, p) = [R + X(i, p)]/R
is a Loewy system in S/R; and it is clear that the mapping of X upon X*
constitutes an order preserving and single valued mapping of L(S) upon
L(S/R) which induces an isomorphism of L_R(S) upon L(S/R). This
completes the proof.




Appendix I. Generalizations. The possibilities of immediate generali- 
zation of the present theory in its entirety seem to be rather meagre. The 
following extensions of the theory can be effected with hardly any effort; and 
we enumerate them here because of their possible value in applications. 

1. Extension of the concept of little extension. If we consider instead
of the little extensions defined in Section 1 extensions G of S with the
property:

(D) there exists a direct decomposition G = L ⊕ K where L is a little
extension of S [in the sense of 1, Definition 1];

then it is easy to check that all the results carry over easily from little
extensions to extensions with Property (D).

Condition (D) may be applied in the following fashion: Denote by P
the subgroup of all the elements in the extension G of S of which a positive
multiple belongs to S. If P is a little extension of S, and if G/P is a free
abelian group, then P is a direct summand of G and (D) is satisfied. This
situation prevails, for instance, in case S and G/S are finitely generated
groups.


2. Subgroups with division. Suppose that there is given an extension
G of the group S, a homomorphism σ of S into the group H and a subgroup
K of H such that K = pK for every prime p. Then it is well known that K
is a direct summand of H. If H = K ⊕ L, then σ = τ + ρ where τ is a
homomorphism of S into K and ρ is a homomorphism of S into L. It is
easy to see that τ is induced by a homomorphism of G into K; and thus σ
will be induced by a homomorphism of G into H whenever ρ is induced by a
homomorphism of G into L. This simple remark may be used for
generalizations of 1, Proposition 1; but because of a certain asymmetry it
does not lend itself too well for further use.


3. Operator groups. If one imposes sufficiently strong conditions on
the operator ring there is no difficulty in extending the whole theory. The
following conditions would be obviously sufficient: the ring R of operators is
commutative, its ideals are principal, the unique decomposition theorem holds
in R, and R is finite modulo its prime ideals.


Appendix II. Dualization. If A is a finite abelian group, then A is
isomorphic to its character group; and this isomorphy provides us with a
self-duality of A. This self-duality of A interchanges subgroups and quotient
groups; and thus it may be used to dualize the results of the present
investigation as far as they refer to finite groups only. We shall state the
results obtained by this process of dualization without going into the details
of their proofs, noting only that the self-duality of A interchanges the
subgroup nA and the subgroup K[n; A] of all the solutions of the equation
nx = 0 in A.
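
One numerical consequence of this interchange is that the orders of nA and K[n; A] multiply to the order of A, since nA is isomorphic to A/K[n; A]. The Python sketch below is an illustration under assumptions: it takes A to be a direct sum of cyclic groups given by a tuple of moduli, and the function name image_and_kernel_sizes is hypothetical.

```python
from itertools import product


def image_and_kernel_sizes(moduli, n):
    """Return (|nA|, |K[n; A]|, |A|) for A = Z_{m1} (+) ... (+) Z_{mk}."""
    elements = list(product(*(range(m) for m in moduli)))
    zero = tuple(0 for _ in moduli)

    def times_n(x):
        # Multiplication by n, taken componentwise.
        return tuple((n * a) % m for a, m in zip(x, moduli))

    image = {times_n(x) for x in elements}                 # the subgroup nA
    kernel = [x for x in elements if times_n(x) == zero]   # the subgroup K[n; A]
    return len(image), len(kernel), len(elements)
```

For A = Z_4 ⊕ Z_6 and n = 2 the function returns (6, 4, 24), so that |nA| · |K[n; A]| = |A| in this instance.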


I. Suppose that S and T are subgroups of the finite groups A and B
respectively.

(a) The homomorphism η of A/S into B/T is induced by a homomorphism
of A into B [which maps S into T] if, and only if,


[(S + K[p^i; A])/S]η ≤ (T + K[p^i; B])/T for every prime power p^i.


(b) The isomorphism κ of A/S upon B/T is induced by an isomorphism
of A upon B [which maps S upon T] if, and only if, A ≅ B and


[(S + K[p^i; A])/S]κ = (T + K[p^i; B])/T for every prime power p^i.


If η is a homomorphism of the group G upon the group Gη = L, then
the pair (G, η) constitutes an extension by L. The extensions (A, α) and
(B, β) by the same group L are equivalent extensions by L, if there exists
an isomorphism σ of A upon B such that xα = xσβ for every x in A.


II. Suppose that A and B are finite groups. Then the extensions (A, α)
and (B, β) by the same group L are equivalent extensions by L if, and only
if, A ≅ B and K[p^i; A]α = K[p^i; B]β for every prime power p^i.

Extension types by L and their partial ordering may be discussed now
in a similar fashion. We omit the details.


UNIVERSITY OF ILLINOIS,
URBANA, ILLINOIS.


BIBLIOGRAPHY. 


R. Baer (1), "Splitting endomorphism," Transactions of the American Mathematical
Society, vol. 61 (1947), pp. 508-516.

A. Kurosch (1), "Kombinatorischer Aufbau der bikompakten topologischen Räume,"
Compositio Mathematica, vol. 2 (1935), pp. 471-476.

(2) "Zur Theorie der teilweise geordneten Systeme von endlichen Mengen,"
Recueil Mathématique, vol. 5 (1939), pp. 343-346.

O. Ore (1), "Structures and group theory. I," Duke Mathematical Journal, vol. 3
(1937), pp. 149-174.

N. Steenrod (1), "Universal homology groups," American Journal of Mathematics,
vol. 58 (1936), pp. 661-701.




GENERALIZED FREE PRODUCTS WITH AMALGAMATED
SUBGROUPS.*


By Hanna NEUMANN. 


PART II. The Subgroups of Generalized Free Products. 


Introduction. In Part I¹ we defined the generalized free product 𝔊 of
the groups 𝔊_α with amalgamated subgroups 𝔘_{αβ}. We found that 𝔊 contains
a group 𝔘 which is itself a free product with amalgamated subgroups 𝔘_{αβ}.
Its factors are the subgroups 𝔘_α of 𝔊_α, which are generated by all the
amalgamated subgroups 𝔘_{αβ} of 𝔊_α. If, in particular, all 𝔘_{αβ} are identical,
then 𝔘_α = 𝔘_{αβ} for all β, all the groups 𝔘_α are isomorphic, and 𝔊 is simply
the free product of the groups 𝔊_α with the one amalgamated subgroup 𝔘. In
the general case the subgroup 𝔘 of 𝔊 may be loosely described as the smallest
generalized free product with amalgamated subgroups 𝔘_{αβ}.

We are going to show that every subgroup ℌ of 𝔊 is itself a generalized
free product, unless ℌ belongs to 𝔘, in which case nothing can be said.


A. Kurosch [1]² has shown that every subgroup ℌ(𝔉) of a free product
𝔉 is itself a free product. The proof consists in the construction of a suitable
system of generators for ℌ(𝔉). In Part I (1) we obtained the generalized
free product 𝔊 of the groups 𝔊_α as a homomorphic image of the free product
𝔉 of these groups 𝔊_α. Every subgroup ℌ of 𝔊 will, therefore, be the map of a
certain subgroup ℌ(𝔉) of 𝔉. Using Kurosch's result, it is not difficult to
see that the homomorphism which maps 𝔉 onto 𝔊 will map the free factors
of ℌ(𝔉) onto groups generating ℌ in such a way that certain subgroups of
the free factors of ℌ(𝔉) are amalgamated. The difficulty is to show that
ℌ is actually the free product of these factors with amalgamated subgroups,
not only a homomorphic image of it. This difficulty necessitated a closer
examination of the generators of ℌ(𝔉) as used by Kurosch and the relations
that may arise between their maps. Thus I was led to the construction of a
system of generators for the subgroup ℌ of 𝔊, in principle similar to Kurosch's
generators, but selected with greater care to meet our particular needs.


* Received August 27, 1947. 
¹ Part I. Definitions and general properties. This JOURNAL, vol. 70 (1948), pp.
590-625. Results will be quoted thus: I. 5.4, meaning Theorem 5.4 of Part I.
² Numbers in square brackets refer to the list of references at the end of this paper.




Throughout the paper, we assume 𝔊 to be a generalized free product.
The (slight) simplifications in the case of a free product with one
amalgamated subgroup are indicated briefly in square brackets.

The main tool is the length of an element of 𝔊 as defined in I. 5.1,
in particular, comparison of the length of a product with that of its factors.
We begin, therefore, with a restatement of the fundamental properties
concerning the representation and multiplication of elements of 𝔊 (1), followed
by a number of useful facts and definitions in connection with the length of a
product in terms of that of its factors (2).

In 3 we define a system of generators for a given subgroup ℌ of 𝔊.
The aim of the subsequent paragraphs is to obtain a comparison of the length
of an element of ℌ with the length of the generators involved in its
representation. This leads to necessary conditions for a word in our generators
to be shorter than the longest generator which occurs in the word (7), and
hence to a description of all possible relations between these generators (8).
These relations, however, are not yet sufficiently simple. But the properties
of the generators derived so far show the way to a further improvement in
their original choice, with the result that the relations are considerably
simplified (11, 12), and can now be interpreted so as to give the main result
(13):


ℌ is a generalized free product with amalgamated subgroups. The factors
are either subgroups of conjugates of the groups 𝔊_α, or of 𝔘 (the latter
become trivial in the case of a free product with one amalgamated subgroup);
or they are of a type analogous to the free generators which occur in Kurosch's
representation of the subgroup of a free product. The amalgamated subgroups
are generated by subgroups of conjugates of the groups 𝔘_α.

If 𝔊 is the free product of the groups 𝔊_α with one amalgamated
subgroup 𝔘, and if 𝔘 is self-conjugate in 𝔊 (i.e. in every factor 𝔊_α), then ℌ
is itself a free product with one amalgamated subgroup 𝔙. 𝔙 is simply the
meet of ℌ and 𝔘. This case includes in particular results by Kurosch and
Kalaschnikov (cf. [2]), where 𝔘 is assumed to be a subgroup of the centre
of every factor 𝔊_α.

In 14 we investigate to what extent those factors of ℌ which are conjugates
of subgroups of 𝔊_α, or of subgroups of 𝔘, are uniquely determined by ℌ.
The results are again a direct generalization of the corresponding results
obtained by Kurosch [1].

We conclude with some remarks on the special part played by the
subgroup 𝔘 of 𝔊 in the general case, indicating the reason why similar results




cannot be expected for the subgroups of this "smallest" generalized free
product 𝔘.


1. Let 𝔊 = Π* (𝔊_α : 𝔘_{αβ} = 𝔘_{βα}) be the generalized free product of the


groups 𝔊_α (finite or infinite in number) with amalgamated subgroups 𝔘_{αβ}.
I.e. 𝔊 is generated by the groups 𝔊_α, and any two of them, 𝔊_α and 𝔊_β,
have meet 𝔘_{αβ} = 𝔘_{βα} in 𝔊. As before, we denote by 𝔘_α the group generated
by all the subgroups 𝔘_{αβ} (α fixed) of 𝔊_α, and by 𝔘 the subgroup of 𝔊 which
is generated by all the groups 𝔘_α. Then 𝔘 is the generalized free product
of the groups 𝔘_α,


and the meet of 𝔘 and 𝔊_α is 𝔘_α, for every α. In fact, it follows from I. 7.0
and I. 6.02 that 𝔘 and 𝔊_α generate in 𝔊 a group which is the free product
of 𝔘 and 𝔊_α with the amalgamated subgroup 𝔘_α.

We assume throughout the paper that for at least one suffix α, 𝔊_α is
actually larger than 𝔘_α; hence 𝔊 is actually larger than 𝔘. This restriction
means that no results will be obtained on the subgroups of the smallest
generalized free product 𝔘. We touch upon this fact once more in the
concluding paragraph.

Every element G of 𝔊 is of the form


1.0     G = U_0 G_{α_1} U_1 G_{α_2} ⋯ G_{α_l} U_l,     l ≥ 0,


with U_λ in 𝔘 (λ = 0, 1, ..., l), G_{α_λ} in 𝔊_{α_λ} − 𝔘_{α_λ} (λ = 1, ..., l), and if
α_λ = α_{λ+1}, then U_λ in 𝔘 − 𝔘_{α_λ} (λ = 1, ..., l − 1). This representation is
not unique: The "factors" G_{α_λ} are determined but for left- and right-hand
factors in 𝔘_{α_λ}, and the "𝔘-factors" U_λ are determined but for left-hand
factors in 𝔘_{α_λ} (for λ = 1, ..., l) and right-hand factors in 𝔘_{α_{λ+1}} (for
λ = 0, ..., l − 1). But the number l = l(G) of factors is an invariant of
G. We call it again the length of G. Obviously, the formal inverse of the
representation 1.0 is itself of the form 1.0; hence we have l(G^{-1}) = l(G).

The elements of length zero are exactly the elements of 𝔘. But it
should be noted that the elements of length one do not necessarily belong
to a factor 𝔊_α; they do if, and only if, both U_0 and U_1 belong to 𝔘_α.

[If 𝔊 is the free product with one amalgamated subgroup, 𝔘 belongs to
every factor and 1.0 simplifies to G = G_{α_1} G_{α_2} ⋯ G_{α_l}.
Here G belongs to a factor 𝔊_α if, and only if, l(G) = 1.]

The representation 1.0 can be made unique, e.g. if we use the normal
form I. 5.1. But the use of some such normal form will, in general, not be




very helpful in this part. What matters is the way in which two such 
elements are multiplied, in particular the length of the product in relation 
to that of the factors. 
Let
     G = U_0 G_{α_1} U_1 ⋯ G_{α_l} U_l
and
     H = V_0 H_{β_1} V_1 ⋯ H_{β_m} V_m


be two elements of length l and m respectively. If one of them, G say, is of
length zero, we have l(GH) = l(HG) = l(H). If both elements are of
length ≥ 1, we have

     GH = U_0 G_{α_1} U_1 ⋯ G_{α_l}(U_l V_0)H_{β_1} V_1 ⋯ H_{β_m} V_m.

If either α_l ≠ β_1, or α_l = β_1, but in that case U_l V_0 does not belong to
𝔘_{α_l} = 𝔘_{β_1}, then this is a representation of the form 1.0 for the product GH;
hence l(GH) = l(G) + l(H). If, however, α_l = β_1 = β and U_l V_0 = U_β
belongs to 𝔘_β, then G_{α_l}(U_l V_0)H_{β_1} is an element of 𝔊_β. Either it belongs
to 𝔊_β − 𝔘_β; in that case we say that G_{α_l}U_l and V_0 H_{β_1} 'amalgamate modulo
𝔘_β' or briefly, that they amalgamate. Or, the product G_{α_l}(U_l V_0)H_{β_1} belongs
to 𝔘, i.e. to 𝔘_β. Then we say, G_{α_l}U_l and V_0 H_{β_1} 'cancel modulo 𝔘_β' or simply,
they cancel. In that case, further amalgamation or cancellation may take
place, if also G_{α_{l−1}} and H_{β_2} belong to the same group, 𝔊_δ say, and the product
of the 𝔘-factors between them belongs to the corresponding subgroup 𝔘_δ
of 𝔘. Otherwise, no further amalgamation, or cancellation, is possible.

Thus, a representation 1.0 of the product GH is obtained after a number
r (0 ≤ r ≤ min (l, m)) of cancellations followed, possibly, by one
amalgamation. Every cancellation disposes of two of the l + m factors of G and
H, the amalgamation replaces two factors by one. Hence:


1.10     l(GH) = l(G) + l(H) − 2r − ε,


where ε = 0 if no amalgamation takes place, ε = 1 otherwise. In either case,
if in the product GH there are exactly s factors at the beginning of G which
are not touched by cancellation and amalgamation, then l(G) = s + r + ε,
hence


1.11     l(GH) = l(H) + s − r,
and therefore
1.12     l(GH) ⋛ l(H) according as s ⋛ r.


An analogous formula compares the length of the product with that of the
first factor G.
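
Formula 1.10 can be illustrated in the simplest special case. The Python sketch below is not the general situation of the text: it assumes an ordinary free product of finite cyclic groups (trivial amalgamated subgroups, so that all 𝔘-factors disappear), represents a reduced word as a list of pairs (factor index, exponent), and counts the cancellations r and the possible amalgamation ε while multiplying. The function name multiply_with_counts and the data layout are hypothetical.

```python
def multiply_with_counts(g, h, orders):
    """Multiply two reduced words in a free product of cyclic groups.

    A word is a list of pairs (a, e): a non-zero exponent e in the cyclic
    factor number a of order orders[a]; adjacent pairs have different a.
    Returns (product, r, eps) with r cancellations and eps in {0, 1}
    recording whether one amalgamation took place.
    """
    g, h = list(g), list(h)
    r = 0
    eps = 0
    while g and h and g[-1][0] == h[0][0]:
        a = g[-1][0]
        e = (g.pop()[1] + h.pop(0)[1]) % orders[a]
        if e == 0:
            r += 1            # the two factors cancel
        else:
            g.append((a, e))  # the two factors amalgamate into one
            eps = 1
            break
    return g + h, r, eps
```

For example, in Z_2 * Z_3 with G = H = [(0, 1), (1, 1), (0, 1)] the last factor of G cancels against the first factor of H, the next pair amalgamates, and the product [(0, 1), (1, 2), (0, 1)] has length 3 = 3 + 3 − 2·1 − 1, in agreement with 1.10. Here s = 1 and r = 1, so 1.11 gives the same value.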




That in the product GH r factors cancel, means that

     α_l = β_1, α_{l−1} = β_2, ..., α_{l−r+1} = β_r,

and for every ρ (1 ≤ ρ ≤ r) we have

1.21     (G_{α_{l−ρ+1}} U_{l−ρ+1} ⋯ G_{α_l} U_l)(V_0 H_{β_1} ⋯ V_{ρ−1} H_{β_ρ}) = U_{β_ρ}   (U_{β_ρ} in 𝔘_{β_ρ});

and if one further factor amalgamates, then also α_{l−r} = β_{r+1}, and


[In the case of a free product with one amalgamated subgroup, cancellation
or amalgamation will take place in the product GH if, and only if, α_l = β_1.
The essential formulae 1.10, 1.11, and 1.12 remain, of course, the same.]

2. Let G and H be two elements of length l and m respectively, where
l ≤ m. Throughout the following paragraphs we shall be particularly
interested in the case that


2.01     l(G^{-1}H) ≤ l(H),
or
2.02     l(HG^{-1}) ≤ l(H).


Both inequalities are always true, if l(G) = 0; in that case, of course,
l(G^{-1}H) = l(HG^{-1}) = l(H). If l = l(G) ≥ 1, they imply, by 1.12, that
at least as many factors of G^{-1} cancel as remain untouched, i.e. at least [½l]
factors cancel, and, if l is odd, at least one more factor cancels or
amalgamates. If, in 2.01 or 2.02, we are to have equality, then exactly [½l]
factors of G^{-1} must cancel, and in the case of even l nothing amalgamates,
while in the case of odd l one more factor of G^{-1} amalgamates, but does not
cancel.

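Now the formal inverse of a representation 1.0 of G gives us a
representation 1.0 of G^{-1}. Hence, if again


     G = U_0 G_{α_1} U_1 ⋯ G_{α_l} U_l and H = V_0 H_{β_1} V_1 ⋯ H_{β_m} V_m,

and if in the product G^{-1}H r factors cancel, we have by 1.21,

     (G_{α_r}^{-1} U_{r−1}^{-1} ⋯ G_{α_1}^{-1} U_0^{-1})(V_0 H_{β_1} ⋯ V_{r−1} H_{β_r}) = U_{β_r}   (U_{β_r} in 𝔘_{β_r});

and therefore, as β_ρ = α_ρ, the representation


     H = U_0 G_{α_1} U_1 ⋯ G_{α_r}(U_{β_r} V_r)H_{β_{r+1}} ⋯ H_{β_m} V_m


is also of the form 1.0; i.e. representations 1.0 of G and H may be assumed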




to be identical up to the r-th factor inclusive. If in the product G^{-1}H one
further factor amalgamates, one concludes similarly from 1.22, that the
first r factors of G and H, including the following 𝔘-factor, may be assumed
to be identical, and the (r + 1)-st factors of G and H belong to the same
group 𝔊_{α_{r+1}}.

If, in particular, G and H satisfy 2.01, then r ≥ [½l], i.e. at least the
first [½l] factors of G may be taken as identical to as many factors in the
beginning of H, and, if l is odd, one more factor of G certainly belongs to
the same group as the corresponding factor of H. And 2.02 entails a similar
statement for the last factors of G and H respectively. This leads to the
following way of writing a representation 1.0 of an element G of length l:

If l = 2k + 1 is odd, and

     G = U_0 G_1 ⋯ G_k U_k G_{k+1} U_{k+1} G_{k+2} ⋯ G_l U_l,

we call S = S(G) = U_0 G_1 ⋯ G_k U_k the first half of G,
        T = T(G) = U_{k+1} G_{k+2} ⋯ G_l U_l the second half of G,
        C = C(G) = G_{k+1} the central factor of G;

if l = 2k is even, and

     G = U_0 G_1 ⋯ U_{k−1} G_k U_k G_{k+1} U_{k+1} ⋯ G_l U_l,

we call S = S(G) = U_0 G_1 ⋯ U_{k−1} G_k the first half of G,
        T = T(G) = G_{k+1} U_{k+1} ⋯ G_l U_l the second half of G,
        C = C(G) = U_k the central factor of G.


Then, in both cases, the representation 1.0 of G and G^{-1} respectively is of
the form G = SCT and G^{-1} = T^{-1}C^{-1}S^{-1},³ where S = S(G) and T = T(G)
are elements of length k = [½l], and C = C(G) of length one or zero
according as l is odd or even. [It should always be remembered, that if l
is odd, S = S(G) includes the 𝔘-factor which follows the factor G_k, while
in the case of even l, S is the partial product of G including G_k, but not U_k.
And correspondingly for T. No such distinction occurs in the case of a free
product with one amalgamated subgroup.]

With this notation, l(G^{-1}H) ≤ l(H) means that G can be written in
such a way in the form G = SCT, that at least S is identical with as many
factors in the beginning of H, and if l is odd, C belongs to the same group,
as the next following factor of H. Similarly for T, if l(HG^{-1}) ≤ l(H).
In particular,


³ Cf. Kurosch, [1], p. 650.




2.03. If l(G) = l(H) = l, and l(G^{-1}H) ≤ l, then G = S(G)C(G)T(G),
H = S(H)C(H)T(H), where we may assume S(G) = S(H); and C(G) and
C(H) belong to the same group. Correspondingly, if l(HG^{-1}) ≤ l, it may be
assumed that T(G) = T(H), and C(G) and C(H) belong to the same group.


If, under the same assumptions, l(G^{-1}H) = l, then if l is odd, C(G)^{-1}
and C(H) amalgamate, but do not cancel; if l is even, C(G)^{-1} and C(H) do,
of course, amalgamate, but the first factors of T(G) and T(H) are not even
amalgamated in the product G^{-1}H = T(G)^{-1}C(G)^{-1}C(H)T(H). Similarly,
if l(HG^{-1}) = l.

The following lemma, frequently used later on, is an immediate conse- 
quence of the preceding discussion: 


2.1. Let F, G, H be three elements of length k, l, m respectively, and
k ≤ l ≤ m. Then


2.11. l(F^{-1}G) ≤ l(G) and l(G^{-1}H) ≤ l(H) implies that also
l(F^{-1}H) ≤ l(H);


2.12. l(F^{-1}H) ≤ l(H) and l(G^{-1}H) ≤ l(H) implies that also
l(F^{-1}G) ≤ l(G).


Also:


2.2. Let again F, G, H be three elements of length k, l, m respectively,
and k ≤ l ≤ m. Then l(GF^{-1}) = l(G) and l(GF^{-1}H) ≤ l(H) implies
l(F^{-1}H) ≤ l(H).


Proof. 2.2 follows from 2.11 in the following way: Put FG^{-1} = G';
then l(G'^{-1}) = l(G') = l(G) = l. Moreover, we have l(F^{-1}G') = l(G^{-1})
= l(G') and l(G'^{-1}H) = l(GF^{-1}H) ≤ l(H); hence, 2.11 applied to the
elements F, G', H gives 2.2.

To complete the preliminaries, we define: 


2.30. An element K such that l(K²) ≤ l(K), is called a transform.


We apply 2.03, with G = K^{-1}, and H = K; then l(G^{-1}H) ≤ l(H) and
l(HG^{-1}) ≤ l(H), hence K can be written in the form K = S(K)C(K)T(K)
in such a way that T(K) = S(K)^{-1}, i.e.


2.31. K = S(K)C(K)S(K)^{-1} is a transform of its central factor; i.e.
a transform of an element of a group 𝔊_α, if l(K) is odd, and a transform of
an element of 𝔘, if l(K) is even.
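
Definition 2.30 and statement 2.31 can be tested by enumeration in a simple special case. The Python sketch below is an illustration only and assumes an ordinary free product of two finite cyclic groups, so that 𝔘 is trivial and a non-trivial transform must have odd length. Words are lists of pairs (factor index, exponent); the names ORDERS, halves, is_transform, has_conjugate_shape and reduced_words are hypothetical.

```python
from itertools import product

ORDERS = (2, 3)  # hypothetical example: the free product Z_2 * Z_3


def multiply(g, h):
    """Product of two reduced words, again as a reduced word."""
    g, h = list(g), list(h)
    while g and h and g[-1][0] == h[0][0]:
        a = g[-1][0]
        e = (g.pop()[1] + h.pop(0)[1]) % ORDERS[a]
        if e != 0:
            g.append((a, e))  # amalgamation ends the reduction
            break
    return g + h


def inverse(word):
    """Formal inverse of a reduced word."""
    return [(a, (-e) % ORDERS[a]) for a, e in reversed(word)]


def halves(word):
    """First half S, central factor C (empty for even length), second half T."""
    k = len(word) // 2
    if len(word) % 2 == 1:
        return word[:k], word[k:k + 1], word[k + 1:]
    return word[:k], [], word[k:]


def is_transform(word):
    """Definition 2.30: the square is not longer than the word."""
    return len(multiply(word, word)) <= len(word)


def has_conjugate_shape(word):
    """The shape S C S^{-1} of 2.31 (odd length, or the empty word)."""
    s, _, t = halves(word)
    return (len(word) % 2 == 1 or not word) and t == inverse(s)


def reduced_words(max_len):
    """All reduced words of length at most max_len."""
    letters = [(a, e) for a, n in enumerate(ORDERS) for e in range(1, n)]
    for length in range(max_len + 1):
        for w in product(letters, repeat=length):
            if all(w[i][0] != w[i + 1][0] for i in range(length - 1)):
                yield list(w)
```

In this special case the two predicates agree on every reduced word that the enumeration produces, which is the content of 2.31 when no amalgamated subgroup is present.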




2.32. If K' is another transform of the same length as K, and if
l(K^{-1}K') ≤ l(K), then K' = S(K)C'S(K)^{-1}, where C' = C(K') belongs to
the same group as C = C(K); and we have also l(KK') ≤ l(K).


This follows immediately from 2.03.

If, in 2.32, l(KK') < l(K), then, in case l(K) is odd, the central factors
of K and K' must cancel. As they both belong to 𝔊_α, we have CC' = U_α,
hence KK' = S(K)U_α S(K)^{-1}. This is again a transform, but of length
≤ l(K) − 1. The exact length depends on whether or not further factors
of S(K) and S(K)^{-1} cancel or amalgamate.

If l(K) is even, we have K = S(K)US(K)^{-1}, K' = S(K)U'S(K)^{-1} and
l(KK') < l(K) implies that in the product KK' = S(K)UU'S(K)^{-1} at
least one cancellation or amalgamation takes place between S(K) and S(K)^{-1}.
If the last factor of S(K) belongs to the group 𝔊_β, UU' must, therefore, be
an element U_β of 𝔘_β and KK' = S(K)U_β S(K)^{-1}.

We define:


An element of 𝔊 which is conjugate to an element of some group 𝔘_α,
is called a 𝔘-transform.

Then we have seen:


2.33. If K = SCS^{-1} and K' = SC'S^{-1} are two transforms of the same
length l such that l(KK') < l, then KK' = SU_γ S^{-1} is a 𝔘-transform. The
suffix γ is that of the group of the central factors of K and K', if l is odd,
and that of the group of the last factor of S, if l is even.


Finally, we add a useful criterion for an element to be a transform: 


2.4. If $G$ is any element of length $l$, $K$ an element of length $\leq l$, then $l(KG) \leq l$ and $l(K^{-1}G) \leq l$ can hold simultaneously if, and only if, $K$ is a transform.

Proof. If $K$ is a transform, then with $l(KG) \leq l$, certainly also $l(K^{-1}G) \leq l$, by 2.03 and 2.31. If on the other hand, both inequalities are satisfied, let $K = S(K)C(K)T(K)$, $l(K) = l_1$. As $l(KG) \leq l$, $T(K)^{-1}$ may be assumed to be identical with the first $[\tfrac12 l_1]$ factors of $S(G)$, and, if $l_1$ is odd, $C(K)$ will belong to the same group as the following factor of $G$. As also $l(K^{-1}G) \leq l$, $S(K)$ may be taken as identical with the first $[\tfrac12 l_1]$ factors of $S(G)$; i.e. $S(K)$ and $T(K)^{-1}$ may be assumed to be identical in a representation 1.0 of $K$. Hence $K$ is a transform.
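
As an illustration of 2.4, consider the simplest special case, which is an assumption made for this example only: an ordinary free product $A * B$ (trivial amalgamated subgroup), with length meaning the number of factors in the normal form. The symbols $a, a', a_1, a_2 \in A$ and $b, b_1 \in B$ are hypothetical non-unit elements. The sketch shows why both inequalities can hold for a transform, and why they force $a' = a^{-1}$ otherwise.

```latex
% Illustrative special case (assumption): ordinary free product A * B, l = 3.
% K = a b a^{-1} is a transform; take G = a b_1 a_2.
\begin{align*}
KG      &= a\,b\,a^{-1}\cdot a\,b_1\,a_2 = a\,(b b_1)\,a_2,         & l(KG) &\le 3,\\
K^{-1}G &= a\,b^{-1}a^{-1}\cdot a\,b_1\,a_2 = a\,(b^{-1} b_1)\,a_2, & l(K^{-1}G) &\le 3.
\end{align*}
% Conversely, let K = a b a' with l(K) = 3 and G = a_1 b_1 a_2.
% (If G began with a factor of B, nothing would cancel and l(KG) = 6.)
\begin{align*}
l(KG) \le 3      &\;\Longrightarrow\; a' a_1 = 1      \;\Longrightarrow\; a_1 = a'^{-1},\\
l(K^{-1}G) \le 3 &\;\Longrightarrow\; a^{-1} a_1 = 1  \;\Longrightarrow\; a_1 = a,
\end{align*}
% so a' = a^{-1}, and K = a b a^{-1} is a transform.
```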

[A transform of length one need not be an element of a group $\mathfrak{G}_\alpha$: it may be of the form $UG_\alpha U^{-1}$ ($U$ not in $\mathfrak{U}_\alpha$). But in the case of a free product

GENERALIZED FREE PRODUCTS. 499 


with one amalgamated subgroup, a transform of length one is necessarily an element of a group $\mathfrak{G}_\alpha$. Moreover, a transform can only be of odd length, i.e. it is always the transform of an element of a group $\mathfrak{G}_\alpha$, not of $\mathfrak{U}$.]


3. Let $\mathfrak{H}$ be any subgroup of $\mathfrak{G}$ which is not wholly contained in $\mathfrak{U}$; its meet with $\mathfrak{U}$ we denote by $\mathfrak{V}$.

We choose a system of generators for $\mathfrak{H}$ as follows:⁴ by $\phi_0$ we denote the set of all elements $V$ of $\mathfrak{V}$. Every element of $\phi_0$ is taken as a generator.

If, for all ordinal numbers $\sigma' < \sigma$, the sets $\phi_{\sigma'}$ of generators are chosen, let $\mathfrak{K}_\sigma$ be the subgroup of $\mathfrak{H}$ which is generated by all the generators of all the sets $\phi_{\sigma'}$ ($\sigma' < \sigma$) taken together. If $\mathfrak{K}_\sigma$ is not yet identical with $\mathfrak{H}$, we define the set $\phi_\sigma$ of generators as follows:


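Let $l$ be the length of a shortest element in $\mathfrak{H} - \mathfrak{K}_\sigma$. If amongst the elements of length $l$ of $\mathfrak{H} - \mathfrak{K}_\sigma$ there exist some which are transforms, we choose one of those as first generator $t_\sigma$ of the set $\phi_\sigma$; and in addition, we take into $\phi_\sigma$ all other transforms $t'_\sigma$ of $\mathfrak{H} - \mathfrak{K}_\sigma$ which are of the same length $l$ and such that $l(t'^{-1}_\sigma t_\sigma) \leq l$. By 2.32, all these transforms may be written with identical first and second halves, i.e. we start the set $\phi_\sigma$ with all elements of length $l$ of $\mathfrak{H} - \mathfrak{K}_\sigma$ which are of the form

3.01. $t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, $t'_\sigma = S_\sigma C'_\sigma S_\sigma^{-1}$, $\ldots$, where all central factors belong to the same group.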


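If outside the group generated by $\mathfrak{K}_\sigma$ and all the elements $t_\sigma, t'_\sigma, \ldots$ there are elements $f$ of $\mathfrak{H}$ of the same length $l$ and with the property $l(f^{-1}t_\sigma) \leq l$, we choose such an element as next generator $f_{\sigma 1}$ of $\phi_\sigma$. I.e. $f_{\sigma 1}$ is determined by

3.02 $l(f_{\sigma 1}) = l$ and $l(f_{\sigma 1}^{-1} t_\sigma) \leq l$.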


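If outside the group generated by $\mathfrak{K}_\sigma$ and the generators $t_\sigma, t'_\sigma, \ldots$ and $f_{\sigma 1}$ there is still an element $f$ in $\mathfrak{H}$ which is of length $l$ and such that $l(f^{-1}t_\sigma) \leq l$, then we choose this as next generator $f_{\sigma 2}$ of $\phi_\sigma$, and so we go on. This process will reach its end at a certain ordinal number $\tau_\sigma$, when outside the group generated by $\mathfrak{K}_\sigma$, $t_\sigma, t'_\sigma, \ldots$, $f_{\sigma\tau}$ ($1 \leq \tau < \tau_\sigma$) no more elements $f$ of $\mathfrak{H}$ exist which are of length $l$ and such that $l(f^{-1}t_\sigma) \leq l$. Then we denote by $\phi_\sigma$ the set of generators $t_\sigma, t'_\sigma, \ldots$ and $f_{\sigma\tau}$ ($1 \leq \tau < \tau_\sigma$), and by $\mathfrak{K}_{\sigma+1}$ the subgroup of $\mathfrak{H}$ which is generated by $\mathfrak{K}_\sigma$ and $\phi_\sigma$.

If, on the other hand, there is no transform of length $l$ in $\mathfrak{H} - \mathfrak{K}_\sigma$, we choose any element of length $l$ as first generator $f_{\sigma 1}$ of $\phi_\sigma$, and then go on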

4 Cf. Kurosch [1], pp. 651-652.




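as before; $\phi_\sigma$ will then consist of generators $f_{\sigma\tau}$ ($1 \leq \tau < \tau_\sigma$) only, and, for every $\tau$,

3.03 $l(f_{\sigma\tau}) = l$ and $l(f_{\sigma\tau}^{-1} f_{\sigma 1}) \leq l$.

$\mathfrak{K}_{\sigma+1}$ is defined as before, as the group generated by $\mathfrak{K}_\sigma$ and $\phi_\sigma$.

This process of generating $\mathfrak{H}$ will reach its end at a certain ordinal number $\sigma_0$ when $\mathfrak{K}_{\sigma_0} = \mathfrak{H}$. Then all the elements of all the sets $\phi_\sigma$ ($0 \leq \sigma < \sigma_0$) form the system $\phi$ of generators of $\mathfrak{H}$.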

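From the discussions in 2 it is seen that the generators of each set $\phi_\sigma$ are all of the same length, have the same first halves, and, if of odd length, their central factors at least amalgamate; i.e. all the generators of $\phi_\sigma$ can be written as:

3.10 $t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, $t'_\sigma = S_\sigma C'_\sigma S_\sigma^{-1}$, $\ldots$, $f_{\sigma\tau} = S_\sigma C_{\sigma\tau} T_{\sigma\tau}$,

where all the central factors $C_\sigma, C'_\sigma, \ldots$, and $C_{\sigma\tau}$ belong to $\mathfrak{U}$, if $l$ is even, or to one group $\mathfrak{G}_\alpha - \mathfrak{U}_\alpha$, if $l$ is odd. In particular we have

3.11. $l(g_i^{-1}g_k) \leq l$ for any two generators $g_i$ and $g_k$ of $\phi_\sigma$.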
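
The inequality 3.11 can be read off from the common form 3.10 by direct multiplication. The following sketch writes two generators of the same set as $g_i = S_\sigma C_i T_i$ and $g_k = S_\sigma C_k T_k$ (with $T = S_\sigma^{-1}$ for the transforms); the names $C_i, T_i, C_k, T_k$ are shorthand introduced here for illustration and are not notation of the paper.

```latex
\begin{align*}
g_i^{-1} g_k &= T_i^{-1} C_i^{-1} S_\sigma^{-1}\cdot S_\sigma C_k T_k
              = T_i^{-1}\,(C_i^{-1} C_k)\,T_k .
\end{align*}
% l odd:  C_i^{-1} C_k lies in one group G_alpha, so it contributes at most one factor:
%         l(g_i^{-1} g_k) <= [l/2] + 1 + [l/2] = l.
% l even: C_i^{-1} C_k lies in U and is absorbed into a neighbouring factor:
%         l(g_i^{-1} g_k) <= [l/2] + [l/2] = l.
```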


Generators of the same length need not necessarily belong to the same set $\phi_\sigma$; but:

3.12. If $f$ is an element of the same length $l$ as the generators of $\phi_\sigma$, such that for some generator $g$ of $\phi_\sigma$, $l(f^{-1}g) \leq l$, then $f$ belongs to $\mathfrak{K}_{\sigma+1}$. If, in addition, $f$ is a transform, then $f$ is itself a generator of the form $t_\sigma$.

Proof. As $l(f^{-1}g) \leq l$, it follows from 2.03 that $f$ is of the form $f = S_\sigma CT$ where $C$ belongs to the same group as the central factors of the generators of $\phi_\sigma$. Hence, for every generator $g' = t_\sigma$ or $= f_{\sigma\tau}$ of $\phi_\sigma$ we have $l(f^{-1}g') \leq l$. By definition of $\phi_\sigma$, no elements of length $l$ with this property can be found in $\mathfrak{H} - \mathfrak{K}_{\sigma+1}$. Hence $f$ belongs to $\mathfrak{K}_{\sigma+1}$.

If $f$ is a transform, then $f$ is of the form $f = S_\sigma CS_\sigma^{-1}$, and as it is of length $l$, and all such elements were taken into $\phi_\sigma$, $f = t_\sigma$.


The process of choosing the generators defines an order relation between them. If, in general, we denote unspecified generators, and their inverses, by $g, g_i, g_k, \ldots$, then $g_i$ is earlier than $g_k$ if in the process of choosing the generators $g_i$ or its inverse was chosen before $g_k$ or its inverse. Unless both, $g_i$ and $g_k$, are of the form $g_i = t_\sigma$, $g_k = t'_\sigma$, in which case no order relation is given for them, $g_i$ is either earlier or later than $g_k$, whenever $g_i \neq g_k^{\epsilon}$ ($\epsilon = \pm 1$): $g_i$ is earlier than $g_k$, if $g_i^{\pm 1}$ belongs to $\phi_{\sigma_i}$, and $g_k^{\pm 1}$ belongs to $\phi_{\sigma_k}$, where $\sigma_i < \sigma_k$. If $\sigma_i = \sigma_k$, i.e. $g_i^{\pm 1}$ and $g_k^{\pm 1}$ belong to the same set $\phi_\sigma$, then $g_i$ is earlier than $g_k$ if either $g_i^{\pm 1} = t_\sigma$, $g_k^{\pm 1} = f_{\sigma\tau}$, or if $g_i^{\pm 1} = f_{\sigma\tau_i}$, and $g_k^{\pm 1} = f_{\sigma\tau_k}$, where $\tau_i < \tau_k$. If, in particular, $g_i$ and $g_k$ are of different length, then the shorter generator is earlier than the longer one.

It will be noted that the rules governing the choice of this system $\phi$ of generators for $\mathfrak{H}$ leave us, in general, considerable freedom in every step. This freedom of choice will be used later to impose a further condition on the generators in order to simplify the possible relations between them. But we have to learn more about the properties of these generators, before we can profitably choose between different sets obtainable by the process described above.


4. This and the following paragraph deal with the immediate conse- 
quences of this choice of generators. They are very much the same as in 
the simpler case of a free product of groups where Kurosch [1] and Neumann 
[3] use similar systems of generators. 

Our principal aim is to derive a system of defining relations for these generators. As usual, we call $P = \prod_{\rho=1}^{r} g_\rho$ a word in the generators $g_1, \ldots, g_r$, if no trivial cancellations are possible, i.e. if $g_{\rho+1} \neq g_\rho^{-1}$ for $\rho = 1, \ldots, r-1$. A relation is a word $R$ which is the unit element as element of $\mathfrak{H}$, i.e. it is of length zero. In order to find all possible relations, we shall investigate under what conditions a word $P$ in generators $g_\rho$ of length $l(g_\rho) \leq l$ can be shorter than $l$, i.e. shorter than the longest generator which occurs in the word.

In this paragraph we deal with words in generators which are all of 
equal length J. We shall need the almost obvious lemma: 


4.0. If $h$ is an element of length $l$ in $\mathfrak{H}$, then $h$ can be expressed by means of generators of length $\leq l$ only.

Proof. Let $\sigma$ be the first ordinal number so that $\phi_\sigma$ consists of generators of length $l_1 > l$. $\mathfrak{K}_\sigma$ is, by definition, the subgroup of $\mathfrak{H}$ which is generated by all generators earlier than the first generator of $\phi_\sigma$, i.e. by all generators of length $\leq l$, and the minimum length of the elements of $\mathfrak{H} - \mathfrak{K}_\sigma$ is $l_1$. As $h$ is of length $l$, $h$ belongs to $\mathfrak{K}_\sigma$.


4.1. If $g_i$ and $g_k$ are generators of equal length $l$, and if $g_i$ is earlier than $g_k$, then $l(g_ig_k) \geq l$ and $l(g_kg_i) \geq l$.

Proof. If e.g. $l(g_ig_k) < l$, and if $g_i$ belongs to $\phi_\sigma$, then, by 4.0, the product $g_ig_k$ certainly belongs to $\mathfrak{K}_\sigma$. Hence $g_k$ belongs to the group generated by $\mathfrak{K}_\sigma$ and all generators of $\phi_\sigma$ which are not later than $g_i$, contrary to the assumption that $g_k$ is later than $g_i$.

A product of two generators of the same length $l$ can, therefore, be shorter than $l$ only, if no order relation is defined for $g_i$ and $g_k$, i.e. if they belong to the same set $\phi_\sigma$ and, unless $g_i = g_k^{-1}$, both are of the form $g_i = t_\sigma$, $g_k = t'_\sigma$. And then $l(g_ig_k) < l$ can, in fact, happen; by 2.33, their product is then a $\mathfrak{U}$-transform, $g_ig_k = S_\sigma U_\beta S_\sigma^{-1}$.

In all other cases, i.e. when one of the generators $g_i$ and $g_k$ is earlier than the other, we need to know when their product can be of minimal length $l(g_ig_k) = l(g_i) = l(g_k) = l$. This certainly is the case, if $g_i$ and $g_k$ belong to the same set $\phi_\sigma$, and either $g_i = t_\sigma$, $g_k = f_{\sigma\tau}$, or $g_i = f_{\sigma\tau_i}^{-1}$, $g_k = f_{\sigma\tau_k}$. In both cases, by 3.11, $l(g_ig_k) \leq l$. But, by 4.1, the product is at least of length $l$, so that it follows now, that for any two generators $g_i = t_\sigma$, $g_k = f_{\sigma\tau}$ or $g_i = f_{\sigma\tau_i}^{-1}$, $g_k = f_{\sigma\tau_k}$ we have $l(g_ig_k) = l$. We want to show that these are in fact the only cases; in all other cases we have $l(g_ig_k) > l$.

In the next two lemmas, $g_i$ and $g_k$ denote generators of certain sets $\phi_{\sigma_i}$ and $\phi_{\sigma_k}$, but not their inverses, so that $g_i = t_{\sigma_i}$ or $g_i = f_{\sigma_i\tau}$, and correspondingly for $g_k$.


4.2.5 If 1(9i) =l(gx) =1, and 1(giggt) =1, then also =1 
and gi=to, gx =t's are two transforms belonging to the same set do. 


COROLLARY. If $g_i = f_{\sigma\tau}$, and $g$ is any other generator of the same length, then $l(g_ig) > l(g_i)$ and $l(g_ig^{-1}) > l(g_i)$.


Proof. Assume that one of the generators, g; say, is earlier than the 
other. 

If gi and g; belong to different sets, gi to $o,, gx to o,, then oi < ox. 
We put f=gigix. Then 1(f) =1, and f*g; 4g, is of length 1. Hence, 
by 3.12, f = gigi belongs to Ro,1. As also g; belongs to the group Ro,.1, 
it follows that also g, belongs to it, contrary to the assumption that 4% 
belongs to a later set than gj. 

Hence, gi and gx belong to the same set do. As gx is later than g; in ¢o, 
gx is of the form fo;,, and gi=to or —for,. Also f gigi? is again of 
length J (otherwise g; were not later than g;), and /(fg,) = 1; therefore, again 
by 3.12, f—?’s is itself a generator of 0 which is earlier than gx. But 
gx =fgi, contrary to the assumption that g,; is a generator later than gj. 

Hence, neither of the two is earlier than the other, i.e. gi = to, gx = t's. 


5 Cf. Kurosch [1], p. 653; Neumann [3], p. 14. 




4.3.⁶ If $l(g_i) = l(g_k) = l$ and $l(g_ig_k) = l$, then $g_i = t_\sigma$ and $g_k = t'_\sigma$ or $= f_{\sigma\tau}$ belong to the same set $\phi_\sigma$.

Proof. Let us assume that $g_i$ and $g_k$ belong to different sets $\phi_{\sigma_i}$ and $\phi_{\sigma_k}$.

If $\sigma_i < \sigma_k$, we put again $f = g_ig_k$. Then $l(f) = l$, moreover $l(f^{-1}g_i) = l(g_k^{-1}) = l$, and as $g_i$ belongs to $\phi_{\sigma_i}$, $f$ belongs to $\mathfrak{K}_{\sigma_i+1}$, by 3.12. But so does $g_i$, and therefore also $g_k = g_i^{-1}f$, contrary to the assumption that $g_k$ belongs to a later set than $g_i$.

If $\sigma_i > \sigma_k$, we put $f = g_i^{-1}$. Since $l(g_ig_k) = l(f^{-1}g_k) = l$, $g_i^{-1} = f$ belongs to $\mathfrak{K}_{\sigma_k+1}$, contrary to the assumption that $g_i$ belongs to a later set than $g_k$.

Hence, $g_i$ and $g_k$ belong to the same set $\phi_\sigma$. But then, by 3.11, $l(g_i^{-1}g_k) \leq l$. As also $l(g_ig_k) = l$, $g_i$ is a transform, by 2.4. I.e. $g_i = t_\sigma$, and then $g_k = t'_\sigma$, or $= f_{\sigma\tau}$, as $g_k$ belongs to the same set.


4.1-4.3 together give immediately : 


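4.4. If $g_i$ and $g_k$ ($\neq g_i^{-1}$) are any two generators of equal length $l$, and if $l(g_ig_k) \leq l$, then $g_i$ and $g_k$ belong to the same set $\phi_\sigma$, and $g_i = t_\sigma$ or $= f_{\sigma\tau_i}^{-1}$, and $g_k = t'_\sigma$ or $= f_{\sigma\tau_k}$. $l(g_ig_k) < l$ if, and only if, both $g_i$ and $g_k$ are transforms, $g_i = t_\sigma$, $g_k = t'_\sigma$, and then $g_ig_k = S_\sigma U_\beta S_\sigma^{-1}$ is a $\mathfrak{U}$-transform.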


4.4 leads us to the corresponding results for a word in more than two generators of equal length $l$:

4.5. If $P = \prod_{1}^{r} g_\rho$ ($r > 2$), where $l(g_\rho) = l$ for $\rho = 1, \ldots, r$, and if no proper partial product of $P$ is shorter than $l$, then $l(P) \leq l$ if, and only if, all generators $g_\rho$ belong to the same set $\phi_\sigma$, $g_\rho = t_\sigma^{(\rho)}$ for $\rho = 2, \ldots, r-1$, and $g_1$ and $g_r$ are such that $l(g_1g_r) \leq l$. $l(P) < l$ only if either also $g_1$ and $g_r$ are transforms, or if $g_1^{-1} = g_r = f_{\sigma\tau}$. In both cases $P$ is a $\mathfrak{U}$-transform.


Proof. Since the product $g_\rho g_{\rho+1}$ of any two successive generators is at least of length $l$, it follows that in every such product the first $[\tfrac12 l]$ factors of $g_\rho$ and the last $[\tfrac12 l]$ factors of $g_{\rho+1}$ are not touched by cancellation and amalgamation, and if $l$ is odd, the central factors at most amalgamate. In particular, in the whole word $P$, the first $[\tfrac12 l]$ factors of $g_1$ and the last $[\tfrac12 l]$ factors of $g_r$ remain untouched. If $P$ is to have no more than $l$ factors in all, it follows, that in every product $g_\rho g_{\rho+1}$ at least $[\tfrac12 l]$ factors must cancel, and if $l$ is odd, the central factors must amalgamate, i.e. $l(g_\rho g_{\rho+1}) \leq l$ for $\rho = 1, \ldots, r-1$. Hence, by 4.4, the first part of 4.5 follows; and we have


6 Cf. Neumann [3], p. 18.




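$$P = g_1 \cdot \prod_{\rho=2}^{r-1} t_\sigma^{(\rho)} \cdot g_r,$$ where $g_1 = t_\sigma^{(1)}$ or $= f_{\sigma\tau_1}^{-1}$, and $g_r = t_\sigma^{(r)}$ or $= f_{\sigma\tau_r}$.

If now $l(P) < l$, then neither of the two generators $g_1$ and $g_r$ can be earlier than the other. For let $g_1$, say, be earlier than $g_r$. As $l(P) < l$, $P$ can be expressed by generators shorter than $l$, i.e. generators which are earlier than all generators of $\phi_\sigma$. Hence $P$ belongs to $\mathfrak{K}_\sigma$, and, a fortiori, to the group generated by $\mathfrak{K}_\sigma$ and all generators of $\phi_\sigma$ which are not later than $g_1$. But to this group also the partial product $g_1 \prod_{\rho=2}^{r-1} t_\sigma^{(\rho)}$ belongs, and therefore also $g_r$, contrary to the assumption that $g_r$ is later than $g_1$.

But if neither of the two generators is earlier than the other, then either both are transforms, or one is the inverse of the other, and then it follows from the first part, that $g_r = f_{\sigma\tau} = g_1^{-1}$. I.e. if $l(P) < l$, then either

$$P = \prod_{\rho=1}^{r} t_\sigma^{(\rho)} = S_\sigma U_\kappa S_\sigma^{-1} \quad\text{or}\quad P = f_{\sigma\tau}^{-1} \prod_{\rho=2}^{r-1} t_\sigma^{(\rho)} \, f_{\sigma\tau} = T_{\sigma\tau}^{-1} U_\kappa T_{\sigma\tau}.$$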
1 2 


5. Next we investigate the length of the product of a single generator $g$ of length $l$ with a word in generators which are all shorter than $l$.

5.0.⁷ If $P = \prod_{1}^{r} g_\rho$ and $\operatorname{Max}\, l(g_\rho) = l_1$, and if $g$ is a generator of length $l > l_1$, then $l(gP) \geq l(g)$ and $l(Pg) \geq l(g)$.

Proof. This is an easy consequence of 4.0: Let $g$ belong to $\phi_\sigma$. All the generators $g_\rho$, being shorter than $g$, belong to earlier sets, hence $P$ belongs to $\mathfrak{K}_\sigma$. If also $gP$, say, were shorter than $g$, this also would be expressible by generators of earlier sets than $\phi_\sigma$, i.e. $gP$, and therefore $g$, would belong to $\mathfrak{K}_\sigma$, contrary to the fact that $g$ belongs to $\phi_\sigma$.


5.1. If, under the same assumptions as in 5.0, $l(P) > l_1$, then $l(gP) > l(g)$ and $l(Pg) > l(g)$; i.e. $gP$ or $Pg$ can be of the same length as $g$ only if $P$ is of the same length as the longest generator involved in its representation.

Proof. Let us assume that, e.g. $l(gP) = l(g)$, and $l(P) = \lambda > l_1$. If $\lambda = 2\kappa$ is even, then by 1.12 exactly $\kappa$ factors of $P$ cancel against $g$, and no amalgamations can take place. If $\lambda = 2\kappa + 1$ is odd, exactly $\kappa$ factors of $P$ cancel against $g$, and one, the central factor of $P$, amalgamates.

In the first case, $\lambda = 2\kappa$, let $P_2 = \prod_{1}^{r_2} g_\rho$ ($r_2 \leq r$) be the earliest partial product of $P$ such that $P_2$ cancels $\kappa$ factors of $g$ in the product $gP_2$, while
7 Cf. Neumann [3], p. 14.




$P_1 = \prod_{1}^{r_2-1} g_\rho$ cancels less than $\kappa$ factors in the product $gP_1$. Such a partial product exists, as by assumption $P$ cancels $\kappa$ factors of $g$. Since $P_1$ cancels $\mu < \kappa$ factors of $g$ and possibly amalgamates one in the product $gP_1$, but $P_2 = P_1g_{r_2}$ cancels $\kappa$ factors of $g$, the remaining $\nu$ factors of $P_1$ which are not affected by cancellation and amalgamation in the product $gP_1$, must cancel against $g_{r_2}$, so that the following factors of $g_{r_2}$ can cancel another $\kappa - \mu$ factors of $g$ in the product $gP_1g_{r_2}$. Since by 5.0 $l(gP_1) \geq l(g)$, we have by 1.12: $\nu \geq \mu$.

In the product $gP_2 = gP_1g_{r_2}$, altogether $\nu + (\kappa - \mu)$ factors of $g_{r_2}$ have been cancelled. But since also $l(gP_2) \geq l(g)$, at least $\kappa$ factors of $P_2$ are left untouched in $gP_2$; i.e., since $P_1$ has been disposed of between $g$ and $g_{r_2}$, at least $\kappa$ factors of $g_{r_2}$ are left untouched in this product, so that

$$l(g_{r_2}) \geq \nu + (\kappa - \mu) + \kappa \geq 2\kappa = \lambda > l_1,$$

against the assumption that all generators $g_\rho$ are of length $\leq l_1$.

In the second case, where $\lambda = 2\kappa + 1$, let $P_2 = \prod_{1}^{r_2} g_\rho$ be the earliest partial product which cancels $\kappa$ and amalgamates one factor of $g$ in the product $gP_2$, while $P_1 = \prod_{1}^{r_2-1} g_\rho$ either cancels less than $\kappa$ factors of $g$, or cancels $\kappa$ factors, but then does not amalgamate a further factor of $g$ in the product $gP_1$. If $P_1$ cancels $\mu$ factors of $g$, we have $\mu \leq \kappa$. If $\nu$ factors of $P_1$ are left unaffected in the product $gP_1$, we have again, because of $l(gP_1) \geq l(g)$, $\nu \geq \mu$. These last $\nu$ factors of $P_1$ must cancel against $g_{r_2}$, so that then $g_{r_2}$ can cancel another $\kappa - \mu$ factors of $g$, and amalgamate one, in the product $gP_1g_{r_2} = gP_2$. Because of $l(gP_2) \geq l(g)$, there must be at least $\kappa$ factors left of $g_{r_2}$ which are not touched by cancellation and amalgamation, so that

$$l(g_{r_2}) \geq \nu + (\kappa - \mu) + 1 + \kappa \geq 2\kappa + 1 = \lambda > l_1,$$

which is the same contradiction as before. Hence $l(gP) = l(g)$ is impossible if $l(P) > l_1$. If $l(gP) > l(g)$, but $l(Pg) = l(g)$, the proof runs similarly, and can be omitted.

We call a word in generators $g_\rho$ of length $l(g_\rho) \leq l$, which cannot be expressed by generators which are all shorter than $l$, a




'prime word,'⁸ if it is itself of length $l$, i.e. of the same length as the longest generator involved in its representation. We denote prime words by $p, p', \ldots$. Then 5.1 becomes:

5.2. If $P = \prod_{1}^{r} g_\rho$ is a word in generators of length $l(g_\rho) \leq l_1$ ($\rho = 1, \ldots, r$), and if there exists a generator $g$ of length $l(g) > l_1$ so that either $l(Pg) = l(g)$ or $l(gP) = l(g)$, then $P = p$ is a prime word of length $l(p) \leq l_1$.


Finally we prove: 


5.3.⁹ If $P = \prod_{1}^{r} g_\rho$, $l(g_\rho) \leq l_1$ ($\rho = 1, \ldots, r$), and if $g$ is a generator of length $l > l_1$, then $l(Pg) > l(P)$ and $l(gP) > l(P)$.


Proof. Again it will be sufficient to prove one of the inequalities only, 
the proof of the other one being similar. 


Let us assume that $l(gP) \leq l(P)$, and let $P_1 = \prod_{1}^{r_1} g_\rho$ be the latest partial product of $P$ such that $l(gP_1) > l(P_1)$; but if $P_2 = \prod_{1}^{r_1+1} g_\rho$, then $l(gP_2) \leq l(P_2)$. Such a partial product $P_1$ exists, since by 5.0 $l(gg_1) \geq l(g) > l(g_1)$.

Since $l(gP_1) > l(P_1)$, by 1.12 more factors of $g$ remain untouched than cancel against $P_1$, i.e. if $l$ is odd, at least the first half and the central factor of $g$ remain untouched, and if $l$ is even, at least the first half of $g$ remains untouched, in the product $gP_1$. If we write $\{\tfrac12 l\} = l - [\tfrac12 l]$,¹⁰ this means, that at least the first $\{\tfrac12 l\}$ factors of $gP_1$ are simply the factors of $g$, and if $l(gP_1) = \{\tfrac12 l\} + \mu$, we have, because of $l(gP_1) \geq l(g)$, $\mu \geq [\tfrac12 l]$.

Since $l(gP_2) \leq l(P_2)$, $g_{r_1+1}$ cancels in this product $gP_2 = gP_1g_{r_1+1}$ at least the second half of $g$, and if $l$ is odd, it also amalgamates or cancels the central factor of $g$. Hence, in this product, $g_{r_1+1}$ must cancel at least the last $\mu$ factors of $gP_1$, and, if $l$ is odd, cancel or amalgamate one more. If in the product $gP_2 = (gP_1)g_{r_1+1}$ exactly $\omega$ factors cancel, we have $\omega \geq \mu$, and by 1.10:

$$l(gP_2) = l(gP_1) + l(g_{r_1+1}) - 2\omega - \epsilon,$$

where from the preceding discussion we know that $\epsilon = 1$, if $\omega = \mu$ and $l$ is odd,

8 Cf. Kurosch [1], p. 652.

9 Cf. Neumann [3], p. 16.

10 While $[\tfrac12 l]$ is the length of the first or second half of an element of length $l$, $\{\tfrac12 l\}$ is the length of first or second half plus central factor.




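i.e. $\{\tfrac12 l\} = [\tfrac12 l] + 1$, but if $\epsilon = 0$ and $\omega = \mu$, then $l$ must be even, i.e. $\{\tfrac12 l\} = [\tfrac12 l]$. It follows:

$$\begin{aligned}
l(gP_2) &= \{\tfrac12 l\} + \mu + l(g_{r_1+1}) - 2\omega - \epsilon \\
        &\leq [\tfrac12 l] + 1 + \mu + l(g_{r_1+1}) - 2\mu - 1 \\
        &= [\tfrac12 l] - \mu + l(g_{r_1+1}) \\
        &\leq l(g_{r_1+1}) < l(g),
\end{aligned}$$

contrary to 5.0.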
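
The estimate that closes the proof of 5.3 covers several cases for the pair $(\omega, \epsilon)$. The sketch below spells them out, writing $g' = g_{r_1+1}$ as a shorthand introduced here, and using $\mu \geq [\tfrac12 l]$, $\omega \geq \mu$, $\epsilon \geq 0$ from the proof, together with the reading $\{\tfrac12 l\} = l - [\tfrac12 l]$ (first half plus central factor), which is an assumption about the printed notation.

```latex
% Start from 1.10:  l(gP_2) = {l/2} + mu + l(g') - 2*omega - epsilon.
\begin{align*}
\omega=\mu,\ l \text{ odd},\ \epsilon=1:\quad
  l(gP_2) &= [\tfrac12 l] + 1 + \mu + l(g') - 2\mu - 1 = [\tfrac12 l] - \mu + l(g'),\\
\omega=\mu,\ l \text{ even}:\quad
  l(gP_2) &\le [\tfrac12 l] + \mu + l(g') - 2\mu = [\tfrac12 l] - \mu + l(g'),\\
\omega\ge\mu+1:\quad
  l(gP_2) &\le [\tfrac12 l] + 1 + \mu + l(g') - 2\mu - 2 < [\tfrac12 l] - \mu + l(g').
\end{align*}
% In every case  l(gP_2) <= [l/2] - mu + l(g') <= l(g') < l(g),  because mu >= [l/2].
```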


6. We deal now with words of the form $g_iPg_k$, where $g_i$ and $g_k$ are generators of equal length $l$, and $P$ is a word in shorter generators.

It is here that the difference between the generalized free product and the ordinary free product becomes apparent. In the ordinary free product, every word of this form is longer than $l$, i.e. longer than the generators $g_i$ and $g_k$,¹¹ whereas in our case its length can be less than, or equal to $l$. This difference alone is responsible for the existence of non-trivial relations between our generators, as will be seen in the following paragraphs. In this paragraph we shall have to find out under what conditions $g_iPg_k$ can be shorter than $g_i$ and $g_k$.


6.0. If at least one of the inequalities $l(g_iP) > l = l(g_i)$ or $l(Pg_k) > l = l(g_k)$ holds, then $l(g_iPg_k) > l$.

Proof. If, e.g. $l(g_iP) > l$, and if $l(P) = \lambda$, then by 1.12, less than $[\tfrac12\lambda]$ factors of $P$ cancel against $g_i$, and at most $[\tfrac12\lambda]$ factors cancel and amalgamate. Besides, by 5.3, $l(g_iP) > l(P)$; hence at least the first $\{\tfrac12 l\}$ factors of $g_i$ remain untouched in the product $g_iP$. And as also $l(Pg_k) > l(P)$, also at least the last $\{\tfrac12 l\}$ factors of $g_k$ remain untouched in the product $Pg_k$. Moreover, $l(Pg_k) \geq l(g_k)$ by 5.1, which means that at most $[\tfrac12\lambda]$ factors of $P$ cancel against $g_k$, and at most $\{\tfrac12\lambda\}$ factors cancel and amalgamate. In the whole product $g_iPg_k$, at least one (possibly amalgamated) factor of $P$ remains therefore between the first $\{\tfrac12 l\}$ factors of $g_i$ and the last $\{\tfrac12 l\}$ factors of $g_k$, i.e. $l(g_iPg_k) \geq \{\tfrac12 l\} + 1 + \{\tfrac12 l\} > l$.


6.1. If $l(g_iPg_k) \leq l$, then $P = p$ is a prime word of length $l(p) < l$, and $g_i$ and $g_k$ belong to the same set $\phi_\sigma$ so that, moreover, $l(g_ig_k) \leq l$.

Proof. From $l(g_iPg_k) \leq l$, together with 6.0 and 5.0, it follows that $l(g_iP) = l = l(Pg_k)$, hence, by 5.2, $P = p$ is a prime word of length $l(p) < l$.


11 Cf., e.g., Neumann, [3], p. 17. 




Let us assume that $g_i$ and $g_k$ belong to different sets $\phi_{\sigma_i}$ and $\phi_{\sigma_k}$, and, without loss of generality, that $\sigma_i < \sigma_k$.

$g_i$ is either of the form $g_i = t_{\sigma_i}$, or $= f_{\sigma_i\tau}$, or $= f_{\sigma_i\tau}^{-1}$.

In the first case, we put $f = g_ipg_k$. Then either $l(f) < l$, in which case $f$ belongs to $\mathfrak{K}_{\sigma_i}$. As $p$ is also shorter than $l$, $p$ also belongs to $\mathfrak{K}_{\sigma_i}$. Hence $g_k = p^{-1}g_i^{-1}f$ belongs to $\mathfrak{K}_{\sigma_i+1}$, contrary to the assumption that $g_k$ belongs to a set later than $\phi_{\sigma_i}$.

Or $l(g_ipg_k) = l(f) = l$. But then $l(f^{-1}g_i) = l(g_i^{-1}f) = l$, i.e. by 3.12, $f$ belongs to $\mathfrak{K}_{\sigma_i+1}$; so does $g_ip$, and therefore also $g_k$, which is the same contradiction as before.

In the other case, where $g_i^{-1} = f_{\sigma_i\tau}$, we put $f = pg_k$. Then $l(f) = l$ and $l(f^{-1}g_i^{-1}) \leq l$, hence again $f$ belongs to $\mathfrak{K}_{\sigma_i+1}$ and therefore also $g_k$ does, in contradiction to the assumption that $g_k$ belongs to a later set than $g_i$.

Therefore, $g_i$ and $g_k$ must belong to the same set $\phi_\sigma$. It remains to be proved that $l(g_ig_k) \leq l$. That $l(g_ig_k) > l$ can happen only in the following two ways (but for trivial variations):

either $g_i = t_\sigma$ or $= f_{\sigma\tau_i}^{-1}$, and $g_k = f_{\sigma\tau_k}^{-1}$,

or $g_i = f_{\sigma\tau_i}$ and $g_k = f_{\sigma\tau_k}^{-1}$.

In the first case, with $f = pg_k$, we have $l(f) = l$, $l(f^{-1}g_i^{-1}) \leq l$ and $l(fg_k^{-1}) \leq l$, where both, $g_i^{-1}$ and $g_k^{-1}$, are generators of $\phi_\sigma$ beginning with $S_\sigma$. Hence, by 3.12, $f = t'_\sigma$ is itself a generator of $\phi_\sigma$. But then $g_k = p^{-1}t'_\sigma$ cannot be a generator later than $t'_\sigma$, contrary to $g_k = f_{\sigma\tau_k}^{-1}$. In the second case, if $g_i \neq g_k^{-1}$, one of them is earlier than the other, $g_i$ earlier than $g_k$, say. We put $f = g_ipg_k$. If $f$ were shorter than $l$, $f$ would lie in $\mathfrak{K}_\sigma$, hence $g_k = p^{-1}g_i^{-1}f$ were not later than $g_i$. Hence $f$ is of length $l$. But also $l(g_i^{-1}f) = l(f^{-1}g_i) = l$ and $l(fg_k^{-1}) = l$, where both $g_i$ and $g_k^{-1}$ are generators of $\phi_\sigma$ beginning with $S_\sigma$. It follows again from 3.12, that $f = t'_\sigma$, and $g_k = p^{-1}g_i^{-1}t'_\sigma$ is expressible by generators not later than $g_i$, which is a contradiction. Therefore, in fact, $l(g_ig_k) \leq l$; which completes the proof of 6.1.

Under the assumptions of 6.1, we have $l(g_ip) = l(pg_k) = l$ and $l(g_ig_k) \leq l$, i.e. $l(g_ip) = l(g_i)$ and $l(g_ig_k) \leq l(g_k)$, hence by 2.11 $l(p^{-1}g_k) \leq l(g_k)$. As also $l(pg_k) = l(g_k)$, it follows from 2.4 that $p$ is a transform. Closer examination shows that $p$ is a $\mathfrak{U}$-transform. This is seen as follows:

Because of $l(g_ig_k) \leq l$, we may write representations 1.0 of $g_i$ and $g_k$,



$g_i = S(g_i)C(g_i)T(g_i)$ and $g_k = S(g_k)C(g_k)T(g_k)$, in such a way that $T(g_i) = S(g_k)^{-1}$. Besides $C(g_i)$ and $C(g_k)$ belong to the same group.

By 5.3, $l(pg_k) > l(p)$, so that $C(g_k)T(g_k)$ is unaffected in the product $pg_k$. But we also have $l(pg_k) = l = l(g_k)$, i.e. the first half of $pg_k$ is simply the product of $p$ and $S(g_k)$. Therefore, a representation 1.0 of $pg_k$ may be written in the form $pg_k = (pS(g_k)) \cdot C(g_k)T(g_k)$. Now the whole product $g_ipg_k$ is a product of two factors of equal length $l$ (viz. $g_i$ and $pg_k$), and itself at most of length $l$; hence $T(g_i)$ cancels against $S(pg_k) = pS(g_k)$, and, if $l$ is odd, the central factors $C(g_i)$ and $C(pg_k) = C(g_k)$ cancel or amalgamate, i.e. the product $T(g_i)pS(g_k)$ is an element of $\mathfrak{U}_\alpha$, if the central factors of $g_i$ and $g_k$ belong to $\mathfrak{G}_\alpha - \mathfrak{U}_\alpha$. Hence, for any $l$, we have

$$T(g_i)\,p\,S(g_k) = U \text{ in } \mathfrak{U},$$

$$p = T(g_i)^{-1}\,U\,S(g_k)^{-1} = S(g_k)\,U\,S(g_k)^{-1}.$$

If $l$ is odd, $U = U_\alpha$, and $p = S(g_k)U_\alpha S(g_k)^{-1}$ is a $\mathfrak{U}$-transform; if $l$ is even, at least one factor each of $S(g_k)$ and $S(g_k)^{-1}$ must cancel or amalgamate in $S(g_k)US(g_k)^{-1}$, as $l(p) < l$; therefore $U = U_\beta$ where $\mathfrak{G}_\beta$ is the group of the last factor of $S(g_k)$, and $p = S(g_k)U_\beta S(g_k)^{-1}$ is again a $\mathfrak{U}$-transform.

We call a prime word which is a $\mathfrak{U}$-transform, a "prime $\mathfrak{U}$-transform." Prime $\mathfrak{U}$-transforms will be denoted by $\pi, \pi', \ldots$. Then we have shown:

6.2. If $g_i$ and $g_k$ are two generators of equal length $l$, $P$ is a word in shorter generators, then $l(g_iPg_k) \leq l$ if, and only if, $g_i$ and $g_k$ belong to the same set $\phi_\sigma$ and $l(g_ig_k) \leq l$, and $P$ is a prime $\mathfrak{U}$-transform, of length $< l$,

$$P = \pi = S(g_k)\,U_\kappa\,S(g_k)^{-1},$$

where $\kappa$ indicates the group of the central factors of $g_i$ and $g_k$ if $l$ is odd, and the group of the last factor of $S(g_k)$, if $l$ is even.


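If in 6.2, $g_i\pi g_k$ is actually shorter than $l$, then $g_i\pi g_k = P'$ is itself a word in generators which are all shorter than $l$, moreover $g_i^{-1}P'g_k^{-1} = \pi$ is shorter than $l$. Hence, 6.2 applied to $g_i^{-1}P'g_k^{-1}$ gives that also $P' = \pi'$ is a prime $\mathfrak{U}$-transform, and also $l(g_i^{-1}g_k^{-1}) \leq l$. Using 4.2 and 4.3, we get

6.30. If, under the same assumptions as in 6.2, we have $l(g_iPg_k) < l$, then either $g_i^{-1} = g_k = f_{\sigma\tau}^{\epsilon}$ ($\epsilon = \pm 1$), and then $g_iPg_k = f_{\sigma\tau}^{-\epsilon}\pi f_{\sigma\tau}^{\epsilon} = \pi'$, where both, $\pi$ and $\pi'$, are prime $\mathfrak{U}$-transforms shorter than $l$, or $g_i = t_\sigma$, $g_k = t'_\sigma$, and then $g_iPg_k = t_\sigma\pi t'_\sigma = \pi'$, where $\pi$ and $\pi'$ are again prime $\mathfrak{U}$-transforms shorter than $l$.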




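For later reference, we give the explicit representation of $\pi$ and $\pi'$, which is easily checked by means of 6.30 and the preceding discussion:

6.31. (i) $l$ odd:

If $g_i^{-1} = g_k = f_{\sigma\tau} = S_\sigma C_{\sigma\tau}T_{\sigma\tau}$, then $f_{\sigma\tau}^{-1}\pi f_{\sigma\tau} = \pi'$, where $\pi = S_\sigma U_\alpha S_\sigma^{-1}$, $\pi' = T_{\sigma\tau}^{-1}U'_\alpha T_{\sigma\tau}$, with $C_{\sigma\tau}^{-1}U_\alpha C_{\sigma\tau} = U'_\alpha$; if $g_i = t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, $g_k = t'_\sigma = S_\sigma C'_\sigma S_\sigma^{-1}$, then $\pi = S_\sigma U_\alpha S_\sigma^{-1}$, $\pi' = S_\sigma U'_\alpha S_\sigma^{-1}$, with $C_\sigma U_\alpha C'_\sigma = U'_\alpha$.

(ii) $l$ even:

If $g_i^{-1} = g_k = f_{\sigma\tau} = S_\sigma UT_{\sigma\tau}$, where the last factor of $S_\sigma$ lies in $\mathfrak{G}_\beta - \mathfrak{U}_\beta$, and the first factor of $T_{\sigma\tau}$ lies in $\mathfrak{G}_\gamma - \mathfrak{U}_\gamma$, then $f_{\sigma\tau}^{-1}\pi f_{\sigma\tau} = \pi'$, where $\pi = S_\sigma U_\beta S_\sigma^{-1}$, $\pi' = T_{\sigma\tau}^{-1}U_\gamma T_{\sigma\tau}$ with $U^{-1}U_\beta U = U_\gamma$; if $g_i = t_\sigma = S_\sigma US_\sigma^{-1}$, $g_k = t'_\sigma = S_\sigma U'S_\sigma^{-1}$, then $\pi = S_\sigma U_\beta S_\sigma^{-1}$, $\pi' = S_\sigma U'_\beta S_\sigma^{-1}$ with $UU_\beta U' = U'_\beta$.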
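
The relations listed in 6.31 can be checked by multiplying out, using only the forms $f_{\sigma\tau} = S_\sigma C_{\sigma\tau}T_{\sigma\tau}$, $t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, $t'_\sigma = S_\sigma C'_\sigma S_\sigma^{-1}$ from 3.10. The following sketch does this for odd $l$; the subscripts on $U$ follow one reading of the printed formulas and should be treated as an assumption.

```latex
\begin{align*}
f_{\sigma\tau}^{-1}\,\pi\,f_{\sigma\tau}
  &= T_{\sigma\tau}^{-1} C_{\sigma\tau}^{-1} S_\sigma^{-1}\cdot S_\sigma U_\alpha S_\sigma^{-1}\cdot S_\sigma C_{\sigma\tau} T_{\sigma\tau}
   = T_{\sigma\tau}^{-1}\,(C_{\sigma\tau}^{-1} U_\alpha C_{\sigma\tau})\,T_{\sigma\tau},\\
t_\sigma\,\pi\,t'_\sigma
  &= S_\sigma C_\sigma S_\sigma^{-1}\cdot S_\sigma U_\alpha S_\sigma^{-1}\cdot S_\sigma C'_\sigma S_\sigma^{-1}
   = S_\sigma\,(C_\sigma U_\alpha C'_\sigma)\,S_\sigma^{-1}.
\end{align*}
% Each product has the stated form pi' when the bracketed element lies in U_alpha.
% For even l, replace the central factors C by elements U, U' of the amalgam and
% U_alpha by U_beta; the computation is the same.
```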


7. After these preparations, we are able to consider arbitrary words in 
our generators. Our aim is to obtain a relation between the length of the 
word as element of , and the length of the longest generator involved in its 
representation. But if any word

7.0 $P = \prod_{\rho=1}^{r} g_\rho$, $\operatorname{Max}\, l(g_\rho) = l$,

is given, it may yet be that $P$ can also be expressed by shorter generators only (examples for this are the situations of 4.5 and 6.3), and then we cannot expect to get an estimate of the length of $P$ in terms of the length of the longest generator in this particular representation.

The lemmas 4. 5, 6.2, and 6.3 provide some means to reduce, possibly, 
the number of longest generators in the representation 7.0. In this and 
the following paragraph it will be shown that these three lemmas suitably 
applied represent all the possibilities of disposing of generators of maximal 
length in a word 7.0. 

We refer to a process which reduces the number of longest generators 
in the word 7.0 as a ‘reduction’ of the word 7. 0. 

It will be useful to list all the different types of reductions based on 4.5, 6.2, and 6.3. To this end, we write 7.0 in the following form:

7.1 $P = \prod_{\rho=1}^{r} g_\rho = \prod_{\nu=1}^{n} P_\nu$ ($r \geq n$),

where every factor $P_\nu$ is either a generator of length $l$, or the partial product



of all the shorter generators preceding the first, or following the last generator of length $l$, or between two successive generators of length $l$.

Moreover, we assume that in 7.1 a generator of length $l$ which is of the form $P_\nu = t_\sigma$ is neither followed nor preceded by a prime $\mathfrak{U}$-transform $P_{\nu\pm 1} = \pi = S_\sigma U_\alpha S_\sigma^{-1}$. For in such a case, the products $t_\sigma\pi$ or $\pi t_\sigma$ are both of length $l$, and therefore themselves generators $t'_\sigma$. Hence the products $P_\nu P_{\nu+1} = t_\sigma\pi$ or $P_{\nu-1}P_\nu = \pi t_\sigma$ can be replaced by the one factor $t'_\sigma$.

We begin the list of reductions with the obvious one, viz. replacing a number $r \geq 2$ of successive generators of the form $t_\sigma^{(1)}, t_\sigma^{(2)}, \ldots$ by their product. This is either of length $l$ and then itself a generator $t_\sigma$, or it is shorter than $l$ and then, by 4.5, a $\mathfrak{U}$-transform $\pi$, as e.g. for every generator $t_\sigma$ we have $l(\pi t_\sigma) = l(t_\sigma)$. The first reduction is therefore:

(i) $r \geq 2$ successive factors $P_{\nu+\rho} = t_\sigma^{(\rho)} = S_\sigma C_\sigma^{(\rho)}S_\sigma^{-1}$ ($\rho = 1, \ldots, r$) are replaced by their product $\prod_{\rho=1}^{r} P_{\nu+\rho} = t_\sigma$ or $= \pi = S_\sigma US_\sigma^{-1}$.
p=1 


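If three successive factors are all generators of length $l$, but not all of the form $t_\sigma$, then by 4.5 their product can be shorter than $l$ only if $P_{\nu-1} = f_{\sigma\tau}^{-1}$, $P_\nu = t_\sigma$, $P_{\nu+1} = f_{\sigma\tau}$, and then

$$P_{\nu-1}P_\nu P_{\nu+1} = f_{\sigma\tau}^{-1}t_\sigma f_{\sigma\tau} = \pi = T_{\sigma\tau}^{-1}U_\alpha T_{\sigma\tau}$$

is a prime $\mathfrak{U}$-transform, by 6.2, as $f_{\sigma\tau}\pi = t_\sigma f_{\sigma\tau}$ is of length $l$. But here also $t_\sigma$ is a $\mathfrak{U}$-transform (hence a prime $\mathfrak{U}$-transform), for

$$t_\sigma = f_{\sigma\tau}\pi f_{\sigma\tau}^{-1} = S_\sigma(C_{\sigma\tau}U_\alpha C_{\sigma\tau}^{-1})S_\sigma^{-1},$$

i.e. $C_\sigma = C(t_\sigma) = C_{\sigma\tau}U_\alpha C_{\sigma\tau}^{-1}$. The second reduction is therefore:

(ii) Three successive generators $f_{\sigma\tau}^{-1}$, $t_\sigma$, $f_{\sigma\tau}$ are replaced by their product, a prime $\mathfrak{U}$-transform:

$$f_{\sigma\tau}^{-1}t_\sigma f_{\sigma\tau} = \pi = T_{\sigma\tau}^{-1}U_\alpha T_{\sigma\tau} \quad\text{with}\quad U_\alpha = C_{\sigma\tau}^{-1}C_\sigma C_{\sigma\tau}.$$

This same relation gives rise to another reduction if we write it in the form

$$t_\sigma f_{\sigma\tau} = f_{\sigma\tau}\pi \quad\text{or}\quad f_{\sigma\tau}^{-1}t_\sigma = \pi f_{\sigma\tau}^{-1},$$

which replaces two generators of length $l$ by only one and a word in shorter generators. And the same reduction is still applicable even if not $t_\sigma$ itself satisfies the above relation, as long as it can be written in the forms $t_\sigma = \pi_1t'_\sigma$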



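or $t_\sigma = t'_\sigma\pi_1$, respectively, where $\pi_1 = S_\sigma U_{\beta_1}S_\sigma^{-1}$, and $t'_\sigma$ satisfies the above relation. Therefore:

(iii) Two successive generators of length $l$, $P_\nu = t_\sigma$ and $P_{\nu+1} = f_{\sigma\tau}$, are replaced by their product

$$P_\nu P_{\nu+1} = t_\sigma f_{\sigma\tau} = \pi_1f_{\sigma\tau}\pi \quad\text{with}\quad \pi_1 = S_\sigma U_{\beta_1}S_\sigma^{-1},\ \pi = T_{\sigma\tau}^{-1}U_\alpha T_{\sigma\tau},$$

and correspondingly $P_\nu = f_{\sigma\tau}^{-1}$ and $P_{\nu+1} = t_\sigma$ are replaced by $P_\nu P_{\nu+1} = f_{\sigma\tau}^{-1}t_\sigma = \pi f_{\sigma\tau}^{-1}\pi_1$ with $\pi_1$ and $\pi$ as before.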


Combining the two reductions (iii) leads to a reduction of the form: 


(iv) Three successive generators of length 1, Py, = for, Pv = te and 
Pyvi41 = for, are replaced by their product 


Py PvP for, ttofor, tifor, 


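By 6.2, a product of the form $g_iPg_k$ will be itself a generator of length $l$ only if $g_i = f_{\sigma\tau}$, $g_k = f_{\sigma\tau}^{-1}$, $P = \pi = T_{\sigma\tau}^{-1}U_\beta T_{\sigma\tau}$, and then $f_{\sigma\tau}\pi f_{\sigma\tau}^{-1} = t_\sigma$, where $t_\sigma$ is a prime $\mathfrak{U}$-transform as in (ii). As in this form two generators are replaced by one, we list it as a separate reduction:

(v) Three successive factors of the form $P_{\nu-1} = f_{\sigma\tau}$, $P_\nu = \pi = T_{\sigma\tau}^{-1}U_\beta T_{\sigma\tau}$, $P_{\nu+1} = f_{\sigma\tau}^{-1}$ are replaced by their product

$$P_{\nu-1}P_\nu P_{\nu+1} = f_{\sigma\tau}\pi f_{\sigma\tau}^{-1} = t_\sigma = S_\sigma C_\sigma S_\sigma^{-1} \quad\text{with}\quad C_\sigma = C_{\sigma\tau}U_\beta C_{\sigma\tau}^{-1}.$$

Finally, a product $g_iPg_k$ which is shorter than $l$, can be replaced by a word in shorter generators only. By 6.3, and our convention that a generator $t_\sigma$ is neither followed nor preceded by a factor $P_\nu = \pi = S_\sigma U_\alpha S_\sigma^{-1}$, this leads to the following reduction:

(vi) Three successive factors of the form $P_{\nu-1} = f_{\sigma\tau}^{-\epsilon}$, $P_\nu = \pi = S(f_{\sigma\tau}^{\epsilon})U_\alpha S(f_{\sigma\tau}^{\epsilon})^{-1}$, and $P_{\nu+1} = f_{\sigma\tau}^{\epsilon}$ ($\epsilon = \pm 1$) are replaced by their product:

$$P_{\nu-1}P_\nu P_{\nu+1} = \pi' = T(f_{\sigma\tau}^{\epsilon})^{-1}U_\beta T(f_{\sigma\tau}^{\epsilon}).$$

The process of reducing 7.1 by means of the reductions (i)-(vi) will come to an end either before all generators of length $l$ in 7.1 are eliminated, or when all generators of length $l$ have disappeared. In that case we are left with a representation of $P$ by means of generators of length $\leq l_1 < l$ only. We repeat the process with respect to the now longest generators of length $l_1$,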




and so on. After a finite number of steps, no further reduction will be possible. A representation of $P$ thus obtained is called a reduced representation of $P$, or simply a reduced word $P$. From now on we assume that 7.1 is a reduced word, where the length of the longest generator is again $l$.

The following properties of reduced words follow immediately : 


7.2. With $\prod_{\nu=1}^{n} P_\nu$ also the formal inverse $\prod_{\nu=n}^{1} P_\nu^{-1}$ is a reduced
word.


7.3. Every partial product $\prod_{\nu=n_1}^{n_2} P_\nu$ ($n \ge n_2 \ge n_1 \ge 1$) of a reduced word
$\prod_{\nu=1}^{n} P_\nu$ which contains at least one factor $P_\nu$ ($n_1 \le \nu \le n_2$) which is a
generator of length $l$, is itself a reduced word.
7.4. If $Q_1$ and $Q_2$ are words in generators which are shorter than $l$,
and $P = \prod_{\nu=1}^{n} P_\nu$ a reduced word, then the reduced representation of
$Q_1 P Q_2 = Q_1 \prod_{\nu=1}^{n} P_\nu \cdot Q_2$ contains the same number of generators of length $l$ as $P$.


Proof. The reduced representation of $Q_1 P Q_2$ is obtained as follows: if
$P_1$ is a word in shorter generators, combine $Q_1$ and $P_1$ into one factor
$Q_1 P_1 = P'_1$. The only case in which this may give rise to a further change
is where $P'_1$ is of the form $\pi = S_\sigma U_\alpha S_\sigma^{-1}$ and $P_2$ is a generator of the form $t_\sigma$.
In that case we have to combine $P'_1$ and $P_2$ into one factor $t'_\sigma$. This does
not change the number of longest generators in $P$, as it cannot cause a
reduction; any reduction caused by replacing $t_\sigma$ by $t'_\sigma$ would have to be of
the type (iii), but then already the original generator $t_\sigma$ would have given
rise to this reduction, contrary to the assumption that $P$ was reduced. A
similar argument shows that $Q_2$ has no effect on the number of longest
generators in $P$.

We are now going to prove the decisive property of reduced words: 


7.5.$^{12}$ If $P = \prod_{\nu=1}^{n} P_\nu$ is a reduced word, and if the length of its longest
generator is $l$, then $l(P) \ge l$.

Proof. We bring this proof in several steps. 

1. By 4.1, 5.0, and 5.3, any two successive factors $P_\nu$ and $P_{\nu+1}$ cancel
at most half of each other; and if $P_\nu$ is a word in shorter generators, hence


12. Cf. Neumann, [3], p. 19.




514 HANNA NEUMANN. 


$P_{\nu-1}$ and $P_{\nu+1}$ generators of length $l$, $P_\nu$ cancels definitely less than the first
half $S(P_{\nu+1})$ of $P_{\nu+1}$ and less than the second half $T(P_{\nu-1})$ of $P_{\nu-1}$. In
particular, if $P_1$ is a word in shorter generators, hence $P_2$ the first generator
of length $l$, then $P_1$ affects less than the first half $S(P_2)$ of $P_2$, and the
product $P_1 \cdot S(P_2)$ is at least of the same length $[\tfrac{1}{2}l]$ as $S(P_2)$. Correspondingly,
if $P_n$ is a word in shorter generators, hence $P_{n-1}$ the last generator
of length $l$, then less than the second half $T(P_{n-1})$ is affected in the product
$P_{n-1} P_n$, and $T(P_{n-1}) \cdot P_n$ is at least of length $[\tfrac{1}{2}l]$.


2. If the first or second half of a generator $P_\nu$ of length $l$ is cancelled
completely by the factors preceding, or following, $P_\nu$, then the second or first
half respectively of the generator of length $l$ preceding, or following, $P_\nu$ is
thereby cancelled as well. For if, e.g., $S(P_\nu)$ is cancelled, and if $P_{\nu-1}$ is
also a generator of length $l$, then, as any two successive factors cancel at
most half of each other, $S(P_\nu)$ must be cancelled by $T(P_{\nu-1})$. But if $P_{\nu-1}$
is a word in shorter generators, hence $P_{\nu-2}$ the generator of length $l$ preceding
$P_\nu$, then $P_{\nu-1}$ does not cancel the whole of $S(P_\nu)$, hence $P_{\nu-1}$ must have been
disposed of between the generators $P_{\nu-2}$ and $P_\nu$, so that then the former can
cancel the rest of $S(P_\nu)$. But then we have the position of 6.2, i.e.
$T(P_{\nu-2}) = S(P_\nu)^{-1}$ and $P_{\nu-1} = \pi = S(P_\nu) U_\alpha S(P_\nu)^{-1}$, and both, $S(P_\nu)$ and
$T(P_{\nu-2})$, cancel completely.


3. If $P_\nu$ is a generator of length $l$ of the form $P_\nu = f_{\sigma\tau}^{\pm 1}$, then if $l$ is
even, the half $T_{\sigma\tau}$ cannot be cancelled completely, and if $l$ is odd, $C_{\sigma\tau}^{\pm 1}$
cannot be altered by the factors following $P_\nu$ (if $P_\nu = f_{\sigma\tau}$) or preceding $P_\nu$
(if $P_\nu = f_{\sigma\tau}^{-1}$). For let, e.g., $P_\nu = f_{\sigma\tau}$. By 4.2 and 4.3, there is no other
generator $g$, i.e. no generator $g \ne f_{\sigma\tau}^{-1}$, of length $l$ such that $l(f_{\sigma\tau} g) \le l$,
or $l(f_{\sigma\tau} g^{-1}) \le l$. Hence, if the second half $T_{\sigma\tau}$ of $f_{\sigma\tau}$ should cancel completely,
and, if $l$ is odd, the central factor $C_{\sigma\tau}$ at least amalgamate, $f_{\sigma\tau}$ must
be followed by a product $P_{\nu+1}$ of shorter generators and a generator $P_{\nu+2}$ of
length $l$, which together cancel $T_{\sigma\tau}$, and if $l$ is odd, amalgamate $C_{\sigma\tau}$. But
then we have, by 6.2, $P_{\nu+2} = P_\nu^{-1} = f_{\sigma\tau}^{-1}$ and $P_{\nu+1} = \pi = T_{\sigma\tau}^{-1} U_\beta T_{\sigma\tau}$, and $P$
would have been reduced. Similarly it follows in the case that $P_\nu = f_{\sigma\tau}^{-1}$.


4. Now we show: in the whole word $P$, the first half of the first generator
of length $l$, and the second half of the last generator of length $l$, are not
touched at all by cancellation or amalgamation. And, if $l$ is odd, the central
factors of the first and last generators of length $l$ can, in the whole word $P$,
at most amalgamate, but never cancel.

Again it will be sufficient to prove this assertion for the first generator




only, the proof for the last generator being analogous. If the first generator
of length $l$ is of the form $f_{\sigma\tau}$, we saw under 3. that in the whole product $P$
cancellation and amalgamation can touch at most the second half $T_{\sigma\tau}$, while
the first half and the central factor remain untouched.

If the first generator of length $l$ is of the form $f_{\sigma\tau}^{-1}$ or $t_\sigma$, so that its
second half is $S_\sigma^{-1}$, and if this half is cancelled completely, then, by 2., this
is done by the next following generator of length $l$.


Either this is of the form $f_{\sigma\tau_1}$. Then it follows from 3. that the next
following generator of length $l$ can have no influence on what happens to
the first. If then the first half of the first generator, or, if $l$ is odd, its
central factor, should be touched at all, the partial product $f_{\sigma\tau}^{-1} P_3 f_{\sigma\tau_1}$ ($P_3$ a
word in shorter generators) or $t_\sigma P_3 f_{\sigma\tau_1}$ respectively of $P$ must be of length
$\le l$. If it were $< l$, $P$ would not have been reduced by 6.3. Hence it is
of length $l$, i.e. if $l$ is odd, the central factor of the first generator amalgamates,
but does not cancel, and if $l$ is even, no amalgamation can take place
between the first half of the first generator and the second half of the second
generator.


Or else the second generator is of the form $t'_\sigma$, and in order that it
cancels the second half of the first generator, a factor between the two would
have to be a prime U-transform of the form $S_\sigma U_\alpha S_\sigma^{-1}$, by 6.2; but this
possibility has been excluded. Hence the first generator of length $l$ is followed
immediately by $t'_\sigma$, and as $P$ is reduced, the first generator can only be of
the form $f_{\sigma\tau}^{-1}$. In the product $f_{\sigma\tau}^{-1} t'_\sigma$, $S(f_{\sigma\tau}^{-1})$ is not affected at all, and
if $l$ is odd, the two central factors amalgamate, but do not cancel. Cancellation,
therefore, could only be achieved by the next following generator of
length $l$, which then would have to cancel the second half, and amalgamate
(at least) the central factor, of $t'_\sigma$ first. For the same reason as before, no
word in shorter generators can occur between the second and third generator
of length $l$, hence the third one must be of the form $f_{\sigma\tau_1}$ and we would have
a product of the form $f_{\sigma\tau}^{-1} t'_\sigma f_{\sigma\tau_1}$. As $P$ is reduced, this must be of length
$\ge l$, i.e. $S(f_{\sigma\tau}^{-1})$ is not touched, and if $l$ is odd, the central factors at most
amalgamate. Moreover, by 3., the factors following $f_{\sigma\tau_1}$ cannot alter this
position any more.

Now it follows from 1. and 4.:


if $l$ is even: $l(P) \ge [\tfrac{1}{2}l] + [\tfrac{1}{2}l] = l$,
if $l$ is odd: $l(P) \ge [\tfrac{1}{2}l] + 1 + [\tfrac{1}{2}l] = l$, q.e.d.
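The two cases can be checked by writing $l$ out explicitly. The following is only a worked restatement, assuming that the square bracket denotes the integer part, as the two displayed cases suggest:

```latex
% Assumption: [x] is the integer part of x.
\begin{align*}
l = 2k:   &\quad l(P) \ge [\tfrac12 l] + [\tfrac12 l] = k + k = 2k = l,\\
l = 2k+1: &\quad l(P) \ge [\tfrac12 l] + 1 + [\tfrac12 l] = k + 1 + k = 2k + 1 = l.
\end{align*}
```

In the odd case the extra summand 1 accounts for the central factor, which by step 4 may amalgamate but never cancels.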


Now the main result of this paragraph follows easily: 




7.6. If the longest generators in a reduced word $P$ are of length $l$,
then $P$ cannot be expressed by generators which are all shorter than $l$.


Proof. Let $P = \prod_{\nu=1}^{n} P_\nu$ be the reduced word whose longest
generators are of length $l$. If $P_1 = \prod_{\rho} g'_\rho$ with $l(g'_\rho) < l$ for all $\rho$ were
another representation of the same element of $\mathfrak{G}$, then $P_1^{-1} P = 1$. But $P_1^{-1}$
is a word in generators which are all shorter than $l$, hence, by 7.4, a reduced
representation of $P_1^{-1} P$ contains the same number of generators of length $l$
as $P$. Hence, by 7.5, $l(P_1^{-1} P) \ge l$, contrary to $P_1^{-1} P = 1$.
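The contradiction in this proof can be written as a single chain. This is only a restatement, under the assumptions that the unit element has length zero and that $l \ge 1$ (neither is spelled out in this paragraph):

```latex
% Assumptions: l(1) = 0 and l >= 1.
% The first inequality is 7.5 applied to a reduced representation of
% P_1^{-1} P, which by 7.4 still contains generators of length l.
\[
0 = l(1) = l(P_1^{-1} P) \;\ge\; l \;\ge\; 1 .
\]
```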


8. From 7, in particular from 7.5 and 7.6, we derive a complete
description of a system of defining relations for the generators of $\mathfrak{G}$.


If $R = \prod_{\rho} g_\rho$ with $\max l(g_\rho) = l$ is a relation between the generators
of our system $\phi$, i.e. $R = 1$ as element of $\mathfrak{G}$, the reduction of $R$ cannot leave
any generators of length $l$, by 7.5. The same is true for the reduction with
respect to the then longest generators, and so on. Hence, after a finite
number of steps, the right-hand side will become identically one by means
of the reductions (i)-(vi). It follows:


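8.0. THEOREM. Every relation between generators of the system $\phi$
follows from relations of one of the following two types:


8.01. $\prod_{\rho} t_\sigma^{(\rho)} = \pi$, where $\pi = S_\sigma U_\alpha S_\sigma^{-1}$ is a prime U-transform of length
$l(\pi) < l(t_\sigma)$, or the unit element;

8.02. $f_{\sigma\tau}^{-1} T f_{\sigma\tau} = \pi'$, where $T = t_\sigma$ or $T = \pi = S_\sigma U_\alpha S_\sigma^{-1}$, and $\pi' = T_{\sigma\tau}^{-1} U_\beta T_{\sigma\tau}$.


(Here, if $l(t_\sigma)$ or $l(f_{\sigma\tau})$ respectively is odd, $\alpha$ indicates the group of the
central factors of the generators of the set $\phi_\sigma$; if this length is even, $\alpha$ indicates
the group of the last factor of $S_\sigma$ and $\beta$ that of the first factor of $T_{\sigma\tau}$.)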


Proof. All that remains to be proved is that the reductions (i)-(vi) 
follow from relations of the form 8.01 and 8.02. This is almost obvious, 
and we shall only state on which type of relation the single reductions are 


based : 
(i): 8.01. 


(ii): 8.02 with $T = t_\sigma$.




(iii) and (iv): 8.01 in the form (SoU = t'o and 
8.02 with T = t’s. 


(v): 8.02 with 
(vi): 8.02 with 


[If $\mathfrak{G}$ is a free product with one amalgamated subgroup, all the elements
$U_\alpha$, $U_\beta$ belong to the one amalgamated subgroup $\mathfrak{U}$. Transforms of even
length do not exist, i.e. a set $\phi_\sigma$ of generators of even length contains
generators $f_{\sigma\tau}$ only, and relations involving these must be of the form 8.02
with $T = \pi$.]

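In the general case, a relation 8.02 with $T = \pi$ and $l(f_{\sigma\tau})$ even entails
for the central factors $C_{\sigma\tau}^{-1} U_\alpha C_{\sigma\tau} = U_\beta$ ($C_{\sigma\tau}$ in $\mathfrak{U}$), where $\mathfrak{U}_\alpha$ and $\mathfrak{U}_\beta$ will,
in general, be different groups. But this does not mean that $U_\alpha$ and $U_\beta$
must both belong to the meet $\mathfrak{U}_{\alpha\beta}$ of $\mathfrak{U}_\alpha$ and $\mathfrak{U}_\beta$, as one might perhaps hope.
One easily constructs examples to the contrary by taking $U_\alpha = U_{\alpha\gamma}$ in
$\mathfrak{U}_{\alpha\gamma} = \mathfrak{U}_\alpha \cap \mathfrak{U}_\gamma$, and $U_\beta = U_{\beta\gamma}$ in $\mathfrak{U}_{\beta\gamma} = \mathfrak{U}_\beta \cap \mathfrak{U}_\gamma$, so that one may choose
$C_{\sigma\tau}$ as an element in $\mathfrak{U}_\gamma$ which, in $\mathfrak{U}_\gamma$, transforms $U_{\alpha\gamma}$ into $U_{\beta\gamma}$. And this
is not the only possible type of example.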

For the system of generators as it is at present, these relations cannot
be simplified any further. However, in these relations there occur certain
prime U-transforms, whose explicit representation by means of shorter generators
we do not so far know. We only know particularly simple examples of
prime U-transforms, viz. those generators of the form $t_\sigma$ whose central factor,
i.e. an element of a group $G_\alpha$, or of $\mathfrak{U}$, is conjugate to an element of $\mathfrak{U}_\alpha$ (or
of any group $\mathfrak{U}_\beta$ in the latter case). It cannot be expected that every prime
U-transform is of this form. In fact, not even those which occur in relations
8.0 need be generators. But in the following paragraphs we are going to
show how, using the freedom left in the selection of the generators (cf. 3),
it is possible to choose them so that their defining relations 8.01 and 8.02
involve only prime U-transforms which are themselves generators of the
form $t_\sigma$.

To this end, we have to find out first which reduced words represent 
prime U-transforms. The next paragraph deals, more generally, with the 
corresponding question for prime words. The incidental results on prime 
words will be used throughout the later paragraphs. 


9. From 7.6 and the definition of a prime word it is clear that the
prime words of length $l$ are those reduced words $P = \prod_{\nu=1}^{n} P_\nu$ whose longest




generators are of length $l$, and which themselves are of the least possible
length $l$.

We denote by $g_\mu$ ($\mu = 1, \dots, m$) all the generators of length $l$ in the
order in which they appear in the reduced word $P = \prod_{\nu=1}^{n} P_\nu$. If then $P$ is to
be of length $l$, the whole first and second half of all the generators $g_1, \dots, g_m$
must cancel, and if $l$ is odd, their central factors must amalgamate. By 4.5
and 6.2, this is possible only if all generators $g_\mu$ ($\mu = 1, \dots, m$) belong
to the same set $\phi_\sigma$, and $T(g_\mu) = S(g_{\mu+1})^{-1}$ for $\mu = 1, \dots, m-1$. Besides,
the product of shorter generators between $g_\mu$ and $g_{\mu+1}$ ($\mu = 1, \dots, m-1$)
can be only the unit element, or a prime U-transform $\pi_\mu = T(g_\mu)^{-1} U_\alpha T(g_\mu)$.
As $P$ is reduced, it follows that $m \le 3$, and if either $g_\mu$ or $g_{\mu+1}$ (or both)
are generators of the form $t_\sigma$, then certainly $\pi_\mu = 1$. Finally, the products of
shorter generators preceding $g_1$, and following $g_m$, must not increase the length
of the whole word, i.e. they must be prime words $p_1$ and $p_2$ respectively of
length $< l$, so that $l(p_1 g_1) = l(g_m p_2) = l$. Now a distinction between the
three possibilities for $m$ gives immediately:

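9.0. A reduced representation of a prime word $p$ of length $l$ is of one
of the following types:

9.01. $p = p_1 g p_2$, $g$ a generator of length $l$;

9.02. $p = p_1 t_\sigma f_{\sigma\tau} p_2$;

9.03. $p = p_1 f_{\sigma\tau_1}^{-1} \pi f_{\sigma\tau_2} p_2$, $\pi = S_\sigma U_\alpha S_\sigma^{-1}$ or $\pi = 1$;

9.04. $p = p_1 f_{\sigma\tau_1}^{-1} t_\sigma f_{\sigma\tau_2} p_2$;

where in all cases $p_1$ and $p_2$ are prime words of length $< l$ with $l(p_1 g_1)
= l(g_m p_2) = l$.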

The first and second half of $p$ depend only on $p_1$ and $S(g_1)$, and $T(g_m)$
and $p_2$, respectively. In particular, in order that there should exist a
generator $g'$ of the same length $l$ as $p$ such that $l(g'p) \le l$ or $l(pg') \le l$,
the second or first half respectively of $g'$ must cancel against $p$, and therefore
against $g_1$ or $g_m$ respectively (as shown in the proof of 7.5). Hence by 6.2,
$g'$ also belongs to $\phi_\sigma$, and in the first case $p_1 = \pi_1 = S(g_1) U_{\alpha_1} S(g_1)^{-1}$ (or
$p_1 = 1$), and in the second case $p_2 = \pi_2 = T(g_m)^{-1} U_{\alpha_2} T(g_m)$ (or $p_2 = 1$)
are prime U-transforms.


Conversely, if $p_1 = \pi_1$ or $p_2 = \pi_2$ are of this form, then there exists a
generator $g'$ of length $l$ such that $l(g'p) \le l$ or $l(pg') \le l$ respectively: one
may simply take $g' = g_1^{-1}$ or $g' = g_m^{-1}$ respectively.



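If $g_1$ or $g_m$ is of the form $t_\sigma$, then, of course, $\pi_1 = 1$ or $\pi_2 = 1$ respectively,
since the representations of $p$ are reduced. But if $g_1$, e.g., is of the
form $f_{\sigma\tau_1}^{-1}$, and $p_1 = \pi_1 = T_{\sigma\tau_1}^{-1} U_{\alpha_1} T_{\sigma\tau_1}$, we have


$T_1 = f_{\sigma\tau_1} \pi_1 f_{\sigma\tau_1}^{-1} = S_\sigma (C_{\sigma\tau_1} U_{\alpha_1} C_{\sigma\tau_1}^{-1}) S_\sigma^{-1}$,


hence $T_1$ is a U-transform, and either $T_1 = t_\sigma$, or $T_1 = \pi = S_\sigma U_\alpha S_\sigma^{-1}$,
according as $l(T_1) = l$, or $l(T_1) < l$. Hence we can replace $\pi_1 f_{\sigma\tau_1}^{-1}$ by
$f_{\sigma\tau_1}^{-1} T_1$, and then combine $T_1$ in the representation 9.02, 9.03, or 9.04 with
the next following factor which is of the same type as $T_1$. Correspondingly,
if $g_m = f_{\sigma\tau_2}$ and $p_2 = \pi_2 = T_{\sigma\tau_2}^{-1} U_{\alpha_2} T_{\sigma\tau_2}$, we replace $f_{\sigma\tau_2} \pi_2$ by $T_2 f_{\sigma\tau_2}$.

The representation of $p$ thus obtained is in general no longer reduced,
but has the advantage that it shows immediately whether there exists a
generator of the same length $l$ as the prime word $p$ whose product with $p$
is at most of length $l$. We call this the "principal form" of the prime word $p$.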


9.1. A principal form of a prime word $p$ of length $l$ is of one of the
following types:


9.11. $p = q_1 t_\sigma q_2$,

9.12. $p = q_1 T f_{\sigma\tau} q_2$,

9.13. $p = q_1 f_{\sigma\tau_1}^{-1} T f_{\sigma\tau_2} q_2$,

where $T = t_\sigma$ or $T = \pi = S_\sigma U_\alpha S_\sigma^{-1}$, or $T = 1$; and $q_1$ and $q_2$ are prime words
of length $< l$ such that $l(q_1 g_1) = l(g_m q_2) = l$; $q_1$ and $q_2$ may be the unit,
but $q_1 \ne S(g_1) U_\alpha S(g_1)^{-1}$ and $q_2 \ne T(g_m)^{-1} U_\alpha T(g_m)$.

9.2. There exists a generator $g$ of the same length $l$ as $p$ such that
$l(gp) \le l$ or $l(pg) \le l$ if, and only if, in the principal form of $p$, $q_1 = 1$ or
$q_2 = 1$ respectively.


We call $q_1$ and $q_2$ the first and second prime factor of $p$ respectively.
In order to find the representations of prime U-transforms, we start with
two lemmas which show that prime words behave very much like the actual
generators:


9.3. If $p$ and $p'$ are two prime words of length $l$ whose first prime
factors in a principal form are one, and if $P$ is a word in shorter generators
such that $l(p^{-1} P p') \le l$, then $l(p^{-1} p') \le l$ and $P = \pi = S(p) U_\alpha S(p)^{-1}$ is a
prime U-transform of length $< l$.


Proof. By 9.2, there exist generators $g$ and $g'$ of length $l$ so that




$l(g^{-1} p) \le l$ and $l(g'^{-1} p') \le l$. Hence $p$ and $p'$ can be written so that
$S(p) = S(g)$ and $S(p') = S(g')$; and if $l$ is odd, also the central factors
of $p$ and $g$, and $p'$ and $g'$, belong to the same group.

In the products $g^{-1} P$ and $P g'$ less than half of $g^{-1}$ and $g'$ respectively are
cancelled by $P$ (by 5.3); therefore the same holds for $p^{-1}$ and $p'$ in the
products $p^{-1} P$ and $P p'$. If $p^{-1} P p'$ is to be of length $\le l$, $P$ must therefore be
disposed of between $p^{-1}$ and $p'$, so that the whole first halves of $p$ and $p'$ can
cancel against each other. But then we have with $l(p^{-1} P p') \le l$ also
$l(g^{-1} P g') \le l$, hence by 6.3 $l(g^{-1} g') \le l$, i.e. $l(p^{-1} p') \le l$, and

$P = \pi = S(g) U_\alpha S(g)^{-1} = S(p) U_\alpha S(p)^{-1}$.

An immediate consequence of 9.3 is


9.4. If $p$ and $p'$ are any two prime words of length $l$ such that
$l(p^{-1} p') \le l$, then there exist principal forms for $p$ and $p'$ whose first prime
factors are equal, and whose first generators of length $l$, $g_1$ and $g'_1$ respectively,
satisfy $l(g_1^{-1} g'_1) \le l$. Correspondingly for the second prime factors and last
generators of length $l$ of two prime words satisfying $l(p' p^{-1}) \le l$.


Proof. Let $p = q_1 \bar{p}$ and $p' = q'_1 \bar{p}'$ be principal forms for $p$ and $p'$, i.e.
$\bar{p}$ and $\bar{p}'$ the product of all the generators following the first prime factors
$q_1$ and $q'_1$ respectively. Then $\bar{p}$ and $\bar{p}'$ are also prime words of length $l$,
and $l(p^{-1} p') = l(\bar{p}^{-1} (q_1^{-1} q'_1) \bar{p}') \le l$, where $q_1^{-1} q'_1$ is a word in shorter
generators, and $\bar{p}$ and $\bar{p}'$ are prime words whose first prime factors are one,
i.e. they begin with generators of length $l$, $g_1$ and $g'_1$ respectively. By 9.3
we have $l(\bar{p}^{-1} \bar{p}') \le l$, hence also $l(g_1^{-1} g'_1) \le l$. And:

$q_1^{-1} q'_1 = \pi = S(\bar{p}) U_\alpha S(\bar{p})^{-1}$, i.e. $q'_1 = q_1 \pi$.

In the principal form $p' = q'_1 \bar{p}'$, $\pi$ can be combined with the first
generator $g'_1$ of $\bar{p}'$ as before. But then $p' = q_1 (\pi \bar{p}')$ has the first prime
factor $q_1$.

From 9.1 together with 9.4, the latter applied to the case that $p^{-1} = p'$,
one obtains the principal forms of a prime U-transform of length $l$:


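9.5. A principal form of a prime U-transform $\pi$ of length $l$ is of one
of the following types:

9.51. $\pi = q_1 t_\sigma q_1^{-1}$, where $t_\sigma = S_\sigma (C U_\alpha C^{-1}) S_\sigma^{-1}$ is a U-transform,

9.52. $\pi = q_1 f_{\sigma\tau}^{-1} T f_{\sigma\tau} q_1^{-1}$, where $T = t_\sigma = S_\sigma (C U_\alpha C^{-1}) S_\sigma^{-1}$ or $T = \pi_1
= S_\sigma U_\alpha S_\sigma^{-1}$.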




And from 9.2:


9.61. There exists a generator $g$ of length $l$ such that $l(\pi g) \le l$ or
$l(g \pi) \le l$, if, and only if, $q_1 = 1$.


9.62. $\pi = t_\sigma$ is itself a generator of length $l$, if, and only if, there
exists a generator $g = t'_\sigma$, or $g = f_{\sigma\tau}$, such that $l(\pi g) \le l$.


Proof. 9.61 is nothing but a restatement of 9.2 for prime U-transforms.
The generator $g$ in 9.61 can either be of the form $f_{\sigma\tau}^{-1}$, i.e. begin with $T_{\sigma\tau}^{-1}$
as first half, or of the form $t'_\sigma$ or $f_{\sigma\tau}$, i.e. begin with $S_\sigma$ as first half. In
the first case, $\pi f_{\sigma\tau}^{-1} = t_\sigma f_{\sigma\tau}^{-1}$ is certainly of length $> l$, by 4.4. And conversely,
if $g = t'_\sigma$ or $f_{\sigma\tau}$, and $\pi = t_\sigma$, then $l(\pi g) \le l$.

Some further properties of prime words, again similar to those of 
generators, will be needed presently. 


9.7. If $p$ is a prime word of length $l$, $P$ a word in generators which
are all shorter than $l$, then $l(p) \le l(Pp) > l(P)$ and $l(p) \le l(pP) > l(P)$.


Proof. It is sufficient to prove one of the two inequalities, the first say.

By 7.4, a reduced representation of $Pp$ contains as many generators of
length $l$ as that of $p$; hence, by 7.5: $l(Pp) \ge l = l(p)$.

To prove $l(Pp) > l(P)$, let the first prime factor of $p$ in a principal
form be $q_1$, i.e. $p = q_1 p_1$, where also $p_1$ is a prime word of length $l$ such that,
by 9.2, there exists a generator $g$ of length $l$ satisfying $l(g p_1) \le l$. In the
product $Pp = P q_1 p_1$, $P q_1$ is a word in shorter generators, hence by 5.3
$P q_1$ cancels less than $S(g)$, and cancels and amalgamates
at most $S(g)$ in the product $P q_1 \cdot g$. As we may assume
$S(g) = S(p_1)$, the same is true for $p_1$ in the product $P q_1 \cdot p_1$, i.e. at least
the last $\{\tfrac{1}{2}l\}$ factors of $p_1$ remain untouched and, if $l$ is even, the last factor
of $S(p_1)$ at most amalgamates. But $l(p p_1^{-1}) = l(q_1) < l$, and both, $p$ and
$p_1$, are of length $l$. Therefore, at least the last $\{\tfrac{1}{2}l\}$ factors of $p$ and $p_1$
may be assumed to be identical and, if $l$ is even, one more factor amalgamates.
But then, in the product $P q_1 \cdot p_1 = Pp$, at least the last $\{\tfrac{1}{2}l\}$ factors of $p_1$,
i.e. of $p$, remain untouched and, if $l$ is even, the last factor of $S(p)$ at most
amalgamates; hence in fact $l(Pp) > l(P)$.


9.8. If $g_1 = t_\sigma$ or $= f_{\sigma\tau}$ is a generator of the set $\phi_\sigma$, and $p$ a prime
word shorter than $g_1$ such that $l(p g_1) = l(g_1)$, then also $l(p g_2) = l(g_2)$
for all other generators $g_2 = t'_\sigma$ or $= f_{\sigma\tau'}$ of $\phi_\sigma$.


Proof. As $p$ is a word in shorter generators we have certainly




$l(p g_2) \ge l(g_2)$ by 5.0. On the other hand we have $l(p g_1) = l(g_1)$ and
$l(g_1^{-1} g_2) \le l(g_2)$, hence by 2.11: $l(p g_2) \le l(g_2)$, and therefore $l(p g_2) = l(g_2)$.
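The proof of 9.8 is a two-sided estimate. The following is only a schematic restatement; the precise statements of 5.0 and 2.11 are not reproduced in this section and are taken here exactly as the proof cites them:

```latex
% 5.0 (as cited):  p a word in shorter generators  ==>  l(p g_2) >= l(g_2).
% 2.11 (as cited): l(p g_1) = l(g_1) and l(g_1^{-1} g_2) <= l(g_2)
%                  ==>  l(p g_2) <= l(g_2).
\[
l(g_2) \;\le\; l(p g_2) \;\le\; l(g_2)
\quad\Longrightarrow\quad
l(p g_2) = l(g_2).
\]
```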


9.9. If $g$ is a generator of length $l$, $p$ a prime word of length $l_1 < l$
whose principal form is p=i9i* where and I(gu) =1,, 
then with I(pg) =1(g) also 1(qog) =1(gmgeg) =1(g), and with = 
also =1(9q1° 91) =1(g). 


Proof. It is sufficient to prove one of the two lines, the first say. If
we write $p = \bar{p} q_2$, then $\bar{p}$ itself is also a prime word of length $l_1 < l$.
Moreover $l(q_2) < l_1$; hence, by 2.2, it follows from $l(\bar{p} q_2) = l(\bar{p})$ and
$l(\bar{p} q_2 g) = l(g)$, that also $l(q_2 g) \le l(g)$. But by 5.0 $l(q_2 g) \ge l(g)$, hence
$l(q_2 g) = l(g)$. $l(g_m q_2 g) = l$ follows in exactly the same way in the case
$m > 1$, where $p$ can be written in the form $p = \bar{p}(g_m q_2)$ with $\bar{p}$ itself a prime
word of length $l_1$. But if $m = 1$, i.e. $p = q_1 g_1 q_2$, we proceed as follows:


If we write p—=qp, i.e. then 1(p) =Il(p) <1(g) and 1(q,) 
=I1(q:") <I(p); hence I(pp*) <1(p*) and =I1(g), and therefore 
by 2.11, applied to g: 


9) 
But by 5.0 21(g), hence 9) =1(g). 
10. Up to now, we do not know how far the reduced representation of
an element is unique. It seems best to deal with this question here in its
natural context, although the results will not be needed until later.


We cannot expect the reduced representation $P = \prod_{\nu=1}^{n} P_\nu$ of an element
of $\mathfrak{G}$ to be unique in the sense that two such representations 7.1 of the same
element are identical. For if, e.g., a factor $P_\nu$ is a generator of the form
$P_\nu = t_\sigma$, and $P_{\nu+1}$ a word in shorter generators, hence $P_{\nu+1} \ne S_\sigma U_\alpha S_\sigma^{-1}$, then
we can replace $P_\nu$ by $P_\nu \pi = t'_\sigma$, and $P_{\nu+1}$ by $\pi^{-1} P_{\nu+1}$, where $\pi$ is any prime
U-transform of the form $\pi = S_\sigma U_\alpha S_\sigma^{-1}$, and the resulting representation of
$P$ will still be reduced. We are not going to examine such possibilities in
detail. All we shall need to know is to what extent the longest generators
are uniquely determined. We restrict ourselves to answering this question.

We begin by comparing two reduced representations of a prime word.
The following proposition is analogous to the result 6.2 for two generators
of length $l$:


10.0. If $p$ and $p'$ are two prime words of length $l$, $P$ a word in generators
shorter than $l$, such that $l(p^{-1} P p') < l$, then any two reduced representations



of $p$ and $p'$ are of the same type (i.e. one of the four types 9.01-9.04), the
number of longest generators $g_\mu$ and $g'_\mu$ involved is the same for both; moreover
it holds: if $g_\mu = f_{\sigma\tau}^{\pm 1}$, then $g'_\mu = g_\mu$, and if $g_\mu = t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, then
$g'_\mu = t'_\sigma = S_\sigma C'_\sigma S_\sigma^{-1}$, where $C'_\sigma$ differs from $C_\sigma$ at most by right- and left-hand
factors in $\mathfrak{U}_\alpha$, or of the form $C_{\sigma\tau} U_\beta C_{\sigma\tau}^{-1}$.

Proof. We write the reduced representations of $p$ and $p'$ in the form
$p = p_1 q p_2$ and $p' = p'_1 q' p'_2$, where $p_1$, $p'_1$ and $p_2$, $p'_2$ are the first and second
prime factors of the reduced words $p$, $p'$ respectively, i.e. all of them prime
words of length $< l$, and $q$ and $q'$ are prime words of length $l$, both beginning
and ending with a generator of length $l$. Then we have


$p^{-1} P p' = (p_2^{-1} q^{-1}) (p_1^{-1} P p'_1) (q' p'_2) = P'$ with $l(P') < l$.

Hence also $P'$ is a word in generators which are all shorter than $l$, and so
are $p_1^{-1} P p'_1$ and $p_2 P' p'^{-1}_2$ in

$q^{-1} (p_1^{-1} P p'_1) q' = p_2 P' p'^{-1}_2$.

As $q$ and $q'$ are prime words of length $l$, whose first prime factors are one
in a reduced representation (which then is also their principal form), we
have by 9.3

10.01. $l(q^{-1} q') \le l$, and $p_1^{-1} P p'_1 = \pi = S(q) U_\alpha S(q)^{-1}$ is of length $< l$.

Hence, using $q (p_2 P' p'^{-1}_2) q'^{-1} = p_1^{-1} P p'_1$ in the same way, we have

10.02. $l(q q'^{-1}) \le l$, and $p_2 P' p'^{-1}_2 = \pi' = T(q)^{-1} U_\beta T(q)$ is of length $< l$.

By 10.01 and 10.02, $p^{-1} P p' = P'$ becomes

10.03. $q^{-1} \pi q' = \pi'$,


where $q$ and $q'$ contain the same generators of length $l$ as $p$ and $p'$, so that
it is sufficient to prove 10.0 for these prime words whose prime factors are
one, given that they satisfy 10.03.

Because of 10.01 and 10.02, $q$ and $q'$ can be written with identical first
and second halves,


$q = S(q) C T(q)$ and $q' = S(q) C' T(q)$,

where $C$ and $C'$ belong to the same group; in fact, because of 10.03,

10.04. $U_\alpha C' = C U_\beta$, i.e. $C' = U_\alpha^{-1} C U_\beta$.


Moreover, $S(q) = S(q')$ may be assumed to be identical with the first half
of the first generator of $q$ as well as $q'$, and $T(q) = T(q')$ identical with the
second half of the last generator of $q$ and $q'$.
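The step from 10.03 to 10.04 can be written out. The display formulas are damaged in the scan, so the following is a sketch under the assumption that 10.03 reads $q^{-1} \pi q' = \pi'$ with $\pi = S(q) U_\alpha S(q)^{-1}$ and $\pi' = T(q)^{-1} U_\beta T(q)$, and that $q = S(q) C T(q)$, $q' = S(q) C' T(q)$:

```latex
% Assumed forms: \pi  = S(q) U_\alpha S(q)^{-1},
%                \pi' = T(q)^{-1} U_\beta T(q),
%                q = S(q) C T(q),  q' = S(q) C' T(q).
\begin{align*}
q^{-1} \pi q'
  &= \bigl(T(q)^{-1} C^{-1} S(q)^{-1}\bigr)
     \bigl(S(q) U_\alpha S(q)^{-1}\bigr)
     \bigl(S(q) C' T(q)\bigr)
   = T(q)^{-1} \bigl(C^{-1} U_\alpha C'\bigr) T(q), \\
\pi' &= T(q)^{-1} U_\beta T(q)
  \;\Longrightarrow\; C^{-1} U_\alpha C' = U_\beta
  \;\Longrightarrow\; C' = U_\alpha^{-1} C\, U_\beta .
\end{align*}
```

So under these assumptions $C'$ differs from $C$ only by a left-hand factor from $\mathfrak{U}_\alpha$ and a right-hand factor from $\mathfrak{U}_\beta$.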




Now we have to distinguish several cases:


1. If $S(q) = T(q)^{-1} = S_\sigma$, then both $q$ and $q'$ can only be of the type
9.01, i.e. $q = t_\sigma = S_\sigma C S_\sigma^{-1}$, $q' = t'_\sigma = S_\sigma C' S_\sigma^{-1}$, with $C'$ as in 10.04.


2. If one of $S(q)^{\pm 1}$ and $T(q)^{\pm 1}$ equals $S_\sigma$, the other $T_{\sigma\tau}$, we may assume
that $S(q) = S_\sigma$, $T(q) = T_{\sigma\tau}$. Then $q$ and $q'$ are either of the type 9.01
with $g = f_{\sigma\tau}$, or of the type 9.02, but with the same generator $f_{\sigma\tau}$, as there
is no other generator with second half $T_{\sigma\tau}$ and a central factor in the same
group as that of $f_{\sigma\tau}$. If $q$ and $q'$ were of different type, e.g. $q = f_{\sigma\tau}$,
$q' = t_\sigma f_{\sigma\tau}$ with $t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, then by 10.03
$q'$ would have been reducible by means of the reduction (iii). Hence they
are of the same type, and either $q = q' = f_{\sigma\tau}$, or $q = t_\sigma f_{\sigma\tau}$, $q' = t'_\sigma f_{\sigma\tau}$, where
because of 10. 04: 


as asserted. 


3. Both $S(q)^{-1}$ and $T(q)$ are second halves of generators $f_{\sigma\tau}$, i.e.
$S(q) = T_{\sigma\tau_1}^{-1}$, $T(q) = T_{\sigma\tau_2}$. Then $q$ and $q'$ are of one of the types 9.03
or 9.04, with the same generators $f_{\sigma\tau_1}$ and $f_{\sigma\tau_2}$. If $q$ and $q'$ were of different
type, it follows as before from 10.03 that the one of type 9.04 would be
reducible (this time by means of the reduction (iv)). Hence they are both
of the same type; if this is the type 9.03, the assertions on the longest generators
are obvious; if it is 9.04, we have $q = f_{\sigma\tau_1}^{-1} t_\sigma f_{\sigma\tau_2}$ and $q' = f_{\sigma\tau_1}^{-1} t'_\sigma f_{\sigma\tau_2}$,
hence by 10.04 $C' = C_{\sigma\tau_1}^{-1} C'_\sigma C_{\sigma\tau_2} = U_\alpha^{-1} C U_\beta = U_\alpha^{-1} C_{\sigma\tau_1}^{-1} C_\sigma C_{\sigma\tau_2} U_\beta$, i.e.
$C'_\sigma = (C_{\sigma\tau_1} U_\alpha^{-1} C_{\sigma\tau_1}^{-1}) C_\sigma (C_{\sigma\tau_2} U_\beta C_{\sigma\tau_2}^{-1})$ for the central factors of $t_\sigma$ and $t'_\sigma$.
This completes the proof of 10.0.


10.0 includes in particular the case where $P = 1$, and $p$ and $p'$ are two
reduced representations of the same prime word. This shows that in the
case of prime words the longest generators involved in a reduced representation
are, essentially, unique.

From this we want to deduce the corresponding result for any two
reduced words which represent the same element of $\mathfrak{G}$:


10.1. If $P = \prod_{\nu} P_\nu$ and $Q = \prod_{\nu} Q_\nu$ are two reduced representations of
the same element of $\mathfrak{G}$, and $g_1, \dots, g_m$ and $g'_1, \dots, g'_{m'}$ respectively the
generators of maximal length $l$ in the order in which they occur in $P$ and $Q$,
then


1. $m = m'$;

2. if for some $\mu$ ($1 \le \mu \le m$), $g_\mu = f_{\sigma\tau}^{\pm 1}$, then $g'_\mu = g_\mu$;

3. if for some $\mu$, $g_\mu = t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$, then also $g'_\mu = t'_\sigma
= S_\sigma C'_\sigma S_\sigma^{-1}$, where $C'_\sigma$ differs from $C_\sigma$ at most by left- and right-hand factors
in $\mathfrak{U}_\alpha$ or of the form $C_{\sigma\tau} U_\beta C_{\sigma\tau}^{-1}$ ($\alpha$ depends, in the usual manner, on the
set $\phi_\sigma$).


Proof. The proof proceeds by induction, using 10.0, i.e. the fact that
we know the truth of these assertions for prime words. To that end we
rewrite the representation 7.1 of a reduced word $P = \prod_{\nu=1}^{n} P_\nu$ as a product
of prime words of length $l$, and words in shorter generators, in the following
way:

If the first factor $P_1$ is a word in shorter generators, hence $P_2$ a generator
of length $l$, and if $l(P_1 P_2) > l$, we take $P_1$ as first factor of the modified
representation; in all other cases we take the longest partial product $p_1 = \prod_{\nu=1}^{n_1} P_\nu$
which is of length $l$ and therefore a prime word, as new first factor. If
then $P_{n_1+1}$ is a word in shorter generators, $p_1$ and $P_{n_1+1}$ cancel less than half
of each other, as otherwise also $\prod_{\nu=1}^{n_1+1} P_\nu$ would have been of length $l$. In that
case, if $l(P_{n_1+1} P_{n_1+2}) > l$, we take $P_{n_1+1}$ as next factor in the new representation,
while in all other cases we take again the longest partial product
$p_2 = \prod_{\nu=n_1+1}^{n_2} P_\nu$ which is of length $l$, and therefore a prime word, as next factor.
So we go on, until finally the original representation 7.1 of $P$ appears written
in the following form:


10.11. $P = P_0 \prod_{\kappa=1}^{k} (p_\kappa P_\kappa)$, where the $p_\kappa$ are
prime words of length $l$, and the $P_\kappa$ ($\kappa = 0, \dots, k$) words in shorter
generators or the unit. Any two successive factors $\ne 1$ (i.e. $p_\kappa$, $P_\kappa$, or
$P_\kappa$, $p_{\kappa+1}$ if $P_\kappa \ne 1$, and $p_\kappa$, $p_{\kappa+1}$ if $P_\kappa = 1$) have the property that their
product is longer than each factor (by 9.7 and the rules for obtaining the
form 10.11). Moreover, if $P_\kappa \ne 1$, then the second prime factor of $p_\kappa$ and
the first prime factor of $p_{\kappa+1}$ are both one (since, as a factor of the representation
7.1, $P_\kappa$ was preceded and followed by a generator of length $l$).




Now we write $Q$ in the same way:


10.12. $Q = Q_0 \prod_{\kappa=1}^{k'} (q_\kappa Q_\kappa)$.


Obviously, we may assume that, of all the different representations of
this type for the same element of $\mathfrak{G}$, $P$ (10.11) is one with minimal $k$.
Then $k' \ge k$.

As $P$ and $Q$ are representations of the same element of $\mathfrak{G}$, we have


10.13. $1 = \prod_{\kappa=k}^{1} (P_\kappa^{-1} p_\kappa^{-1}) \cdot P_0^{-1} Q_0 \cdot q_1 Q_1 \prod_{\kappa=2}^{k'} (q_\kappa Q_\kappa)$.


Here, all generators of length $l$, hence all prime words on the right-hand
side, must cancel. Since at most the second half, i.e. the last $[\tfrac{1}{2}l]$ factors,
of $q_1$ are cancelled from the right, at least the first $\{\tfrac{1}{2}l\}$ factors of $q_1$ must
be cancelled from the left. We show that, thereby, also at least the last
$\{\tfrac{1}{2}l\}$ factors of $p_1^{-1}$ are cancelled, so that


10.14. $l(p_1^{-1} P_0^{-1} Q_0 q_1) < l$.


This is seen as follows: Since at most the first $[\tfrac{1}{2}l]$ factors of $p_1^{-1}$ are
cancelled from the left, the assertion is obvious in the case that $P_0^{-1} Q_0 = 1$.

If $P_0^{-1} Q_0 \ne 1$, then, by 9.7, at most half of it cancels on either side
against less than $[\tfrac{1}{2}l]$ (if $l$ is even), or against at most $[\tfrac{1}{2}l]$ (if $l$ is odd)
factors of $p_1^{-1}$ and $q_1$. In order that the first $\{\tfrac{1}{2}l\}$ factors of $q_1$ should be
cancelled from the left, $P_0^{-1} Q_0$ must, therefore, be eaten up between $p_1^{-1}$ and
$q_1$, i.e. its complete first and second halves must cancel on either side, and
its central factor (if not in $\mathfrak{U}$) must cancel together with the two following
factors on either side. Only then can the remaining factors of $p_1^{-1}$ cancel
the remaining of the first $\{\tfrac{1}{2}l\}$ factors of $q_1$. But, as $p_1^{-1}$ and $q_1$ are of the
same length $l$, an equal number of factors of $p_1^{-1}$ and $q_1$ (and, after the
above, at least $\{\tfrac{1}{2}l\}$) are cancelled in the product $p_1^{-1} P_0^{-1} Q_0 q_1$, which proves
10.14.

Therefore, $p_1^{-1} (P_0^{-1} Q_0) q_1 = \bar{P}$ with $l(\bar{P}) < l$; hence by 10.0, the assertions
1.-3. of 10.1 are true for the longest generators involved in the representations
of $p_1$ and $q_1$ (both are reduced, by 7.3); and these are the same
as the first $m_1$ ($m_1 \le 3$) generators of length $l$ in the given reduced
representations 7.1 of $P$ and $Q$ respectively.

Besides, $\bar{P}$ is itself a word in shorter generators, and $P_0 p_1 = Q_0 q_1 \bar{P}^{-1}$.
After multiplication from the left with $(P_0 p_1)^{-1}$, $P = Q$ becomes


$P_1 \prod_{\kappa=2}^{k} (p_\kappa P_\kappa) = (\bar{P} Q_1) \prod_{\kappa=2}^{k'} (q_\kappa Q_\kappa)$.




Here the left-hand side is itself of the form 10.11; and so is the right-hand 
side after, possibly, PQ, has been combined with q. (viz. if the product PQ,q2 
is of length 7, and therefore itself a prime word). But the left-hand side 
involves only & —1 prime words of length /, so that an induction with respect 
to k completes the proof of 10. 1. 


11. Now at last we can consider the problem of a more careful selection 
of the generators in order to simplify the relations 8.0 as desired. 

As in 8, $\phi_0$ is chosen as the set of all elements of length zero in $\mathfrak{H}$, i.e.
it consists of all elements $V$ of the meet $\mathfrak{V} = \mathfrak{H} \cap \mathfrak{U}$.

Assume that the sets $\phi_{\sigma'}$ of generators have been chosen for all ordinal
numbers $\sigma' < \sigma$. Then $\mathfrak{R}_\sigma$ denotes again the group generated by all the
generators of all the sets $\phi_{\sigma'}$ ($\sigma' < \sigma$). If $\mathfrak{R}_\sigma$ is not yet the whole group $\mathfrak{H}$,
the set $\phi_\sigma$ will be chosen as in 3, but with an additional rule to be observed:


The first generator of $\phi_\sigma$ has to be an element of minimal length $l$ of
$\mathfrak{H} - \mathfrak{R}_\sigma$, and, if possible, a transform. Let $f_1, f_2, \ldots$ be all the elements
of $\mathfrak{H} - \mathfrak{R}_\sigma$ which satisfy these first requirements (we choose this notation
although they need not, of course, be denumerable), i.e. all transforms of
length $l$, if transforms of length $l$ exist in $\mathfrak{H} - \mathfrak{R}_\sigma$, otherwise all elements
of length $l$ in $\mathfrak{H} - \mathfrak{R}_\sigma$.

Let $f$ be any one of these elements, $g$ any generator of a set $\phi_{\sigma'}$ ($\sigma' < \sigma$)
already chosen. Then with $f$ also $gf$ belongs to $\mathfrak{H} - \mathfrak{R}_\sigma$, hence $l(gf) \ge l = l(f)$.
We now consider all those earlier generators $g_{\sigma'} = t_{\sigma'}$ or $= f_{\sigma'\tau'}$ (but not of
the form $f_{\sigma'\tau'}^{-1}$) with the property that


11.00. $l(g_{\sigma'}^{-1}f) = l$.


All generators with this property are of length $< l$. For if there were one
of the same length $l$ as $f$, $g_{\sigma'}$ in $\phi_{\sigma'}$ say, then, by 3.12, $f$ would belong to $\mathfrak{R}_{\sigma'+1}$,
and a fortiori to $\mathfrak{R}_\sigma$.


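11.01. Put $\lambda_f = \operatorname{Max}\, l(g_{\sigma'})$


where $g_{\sigma'}$ varies over all generators satisfying 11.00. Then $\lambda_f < l$. If no
generator $g_{\sigma'}$ satisfying 11.00 exists, we put $\lambda_f = 0$. Finally, we put


11.02. $\lambda_\sigma = \operatorname{Max}\, \lambda_f$, where $f$ varies over all elements $f_1, f_2, \ldots$.

In the case that these elements $f_1, f_2, \ldots$ are transforms, we choose any
one for which $\lambda_f$ is maximal, i.e. $\lambda_f = \lambda_\sigma$, as first generator $t_\sigma$ of $\phi_\sigma$, and,
as in 8, we add all transforms $t'_\sigma$ with $l(t'_\sigma t_\sigma) \le l$ to $\phi_\sigma$. I.e. $t_\sigma, t'_\sigma, \ldots$
are chosen so that: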




528 HANNA NEUMANN. 


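11.10. There exists an earlier generator $g_{\sigma'}$ of length $\lambda_\sigma$ so that
$l(g_{\sigma'}^{-1}t_\sigma) = l = l(t_\sigma)$;


11.11. if also $l(g_{\sigma''}^{-1}t_\sigma) = l$ for some earlier generator $g_{\sigma''} = t_{\sigma''}$ or
$= f_{\sigma''\tau''}$, then $l(g_{\sigma''}) \le \lambda_\sigma$;


11.12. if $t$ is any other transform of length $l$ in $\mathfrak{H} - \mathfrak{R}_\sigma$, and $l(g_{\sigma''}^{-1}t) = l$ for some
generator $g_{\sigma''} = t_{\sigma''}$ or $= f_{\sigma''\tau''}$, then $l(g_{\sigma''}) \le \lambda_\sigma$.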


If outside the group generated by Ro and the elements to, t’c,: - - there 
are still elements f of length 7 such that /(f“tc) 1, we have to choose the 
next generator fo,, of ¢o from these elements f. Here again, we denote for 
every such f by A;,, the length of a longest generator go. =to or = fo; 
(but such that 


l(fgor) =I. 


Then again A;,, <1 for all the elements f. Again we put 
Ao,1 Max Af,1s 
f 


and choose as next generator fc, of da an element f such that A;,, is maximal, 
i.e. Then the generator besides satisfying 11. 10-11. 12 
(because it has the same first half and a central factor out of the same group, 
as ta) has the property that: 


11.20. There exists an earlier generator go 1 of length ro, so that 
U( forgo’) =1=1(fo,) 
11.21. «tf also =I for some earlier generator gov or 
= then 


11.22. «if f is any other element of length 1 in §—8R, such that 
S 1, and if for some earlier generator = tos or = =, 
then 1(go+) S 


So we go on: If all the generators to, for for all r’< + have been chosen, 
and there are still elements f of length 7 outside the group generated by Rs 
and all these generators to and fo, such that 1(f-%te) <1, we choose from 


18 Not normally the same as in 11.10, of course. 




these for such that the length Ao,, of a longest generator ga with 1(fgo') = l, 
is as great as possible. Again, 11. 10-11. 12 holds also for fa,-. 

In the other case where $\mathfrak{H} - \mathfrak{R}_\sigma$ does not contain transforms of length $l$,
we choose the generators correspondingly: first we see to it that $\lambda_\sigma$ is maximal,
i.e. $f_{\sigma 1}$, and all further generators of the set, will have to be chosen from
those elements of length $l$ of $\mathfrak{H} - \mathfrak{R}_\sigma$, for which the length $\lambda_\sigma$ of a longest
earlier generator $g_{\sigma'}$ with the property $l(g_{\sigma'}^{-1}f_{\sigma 1}) = l$, is maximal. Then we
satisfy, step by step, the conditions that $\lambda_{\sigma,\tau}$ (defined as above) is as great
as compatible with the earlier choices. Again, $\lambda_\sigma$ is the same for all generators
$f_{\sigma\tau}$ of $\phi_\sigma$, i.e. given by 11.10-11.12 (with $f_{\sigma 1}$ in place of $t_\sigma$).

The gist of it is: In every step we see to it that as many factors as 
possible, both of the first, and of the inverse of the second half, of each 
generator are identical with the first half of some earlier generator. That 
this choice ensures the desired simplifications of the relations 8.0, we are 
going to show in the next paragraph. 

We keep for these improved generators the old notation: $g_\sigma$, etc.


12. All properties and results derived earlier for the generators of the 
system ¢ hold, of course, for our present generators. But the former result 
5.2 can now be improved as follows: 


12.0. If $g$ is a generator of length $l$, $P$ a word in shorter generators
such that $l(Pg) = l$ or $l(gP) = l$, then $P = p$ is a prime word of length
$l_1 < l$ with the property that there exists a generator $g_{\sigma_1}$ of length $l_1$, $g_{\sigma_1} = t_{\sigma_1}$
or $= f_{\sigma_1\tau_1}$, so that: $l(pg_{\sigma_1}) \le l_1$, or $l(g_{\sigma_1}^{-1}p) \le l_1$, respectively.


Before we prove 12.0, let us consider the consequences: It follows 
immediately that the earlier result 6. 2 can now be replaced by the following: 


12.1. If $g_i$ and $g_k$ are generators of equal length $l$, $P$ a word in shorter
generators, such that $l(g_iPg_k) \le l$, then $g_i$ and $g_k$ belong to the same set $\phi_\sigma$,
$l(g_ig_k) \le l$, and $P = \pi$ is a prime $\mathfrak{U}$-transform of length $l_1 < l$ which is itself
a generator $t_{\sigma'}$ ($\sigma' < \sigma$) of length $l_1$.


Proof. By 6.2, P= is a prime U-transform of length J, < 1, and 
l(gir) =1(7gx) =1. Hence, using 12.0, there exists a generator go of 
the form go =to or = for of length J,, such that I(rgo') S1,; as is a 
transform, also <1,. Therefore, by 9. 62, 7 = to. 


In particular, 6.3 is now improved to: 


12.2. If under the same assumptions as in 12.1, l(giPgr) <1, then 




either gx=t'o, or gi=for, And then 
where P=2—=—to, P =to (0 <a, <a). 


And from this follows that the relations 8.0 assume the following sim- 
plified form for our present generators: 


12.3. THEOREM. Every relation between the generators of the system 
@ (consisting of the sets do (O0Sa<ao) of generators to and fo, 
(1S1r<10)) follows from relations of the following two types: 


12.31. [[ to’ =te, where to. =SoU,8,* is a generator of a set du 
p=1 
with <a; 
12.32. for*tofor =to where to to» are generators of the 
form to = So(CorU So" and tov = Tor UgT or with o’ 
<a. 


If the set ¢c does not contain generators to, then we have in 12.32 
necessarily < I(for), ie. o <a. 

We denote this system of defining relations 12.31 and 12.32 for the
generators of the system $\phi$ by $\mathfrak{R}$.


It now remains to prove 12.0: 


Proof of 12.0. That under the assumptions of the theorem $P = p$ is a
prime word of length $l_1 < l$, we know by 5.2. All we have to prove is the
additional property of the prime word $p$ which can be expressed briefly as
follows: If there is any generator $g$ longer than the prime word $p$, such that
the product $pg$ is of the same length as this generator, then there is also
a generator $g_1$ of the same length as the prime word and beginning with a
half $S_\sigma$ such that their product $pg_1$ is of the same length as this generator $g_1$.
And correspondingly for the generator as left-hand factor.


12.0 is trivially true for all generators $g$ of the set $\phi_0$ and all sets whose
generators are of length one. Hence we may proceed by a transfinite induction,
i.e. assume that 12.0 is true for all products $pg$ and $gp$ where $g$ belongs
to a set $\phi_{\sigma'}$ with $\sigma' < \sigma$. We then have to prove it for all such products with
a generator $g$ of $\phi_\sigma$.

Obviously it is sufficient to prove it for one of the two cases, e.g. $l(pg) = l$.
The other case reduces to this one if one considers the inverse $p^{-1}g^{-1}$ which
is also of length $l$.

First let us assume that $g$ is of the form $g = t_\sigma$ or $g = f_{\sigma\tau}$, i.e. that $g$




begins with $S_\sigma$. In that case we may even assume that $g$ is the first generator
of $\phi_\sigma$, since, by 9.8, $l(pg_i) = l(g_i)$ for any generator $g_i = t_\sigma$ or $f_{\sigma\tau}$ of $\phi_\sigma$
implies the same for the first generator of $\phi_\sigma$, and vice versa.

Now let $g_{\sigma'} = t_{\sigma'}$ or $= f_{\sigma'\tau'}$ be a longest generator such that $l(g_{\sigma'}^{-1}g) = l$,
then $l(g_{\sigma'}) = \lambda_\sigma$, where $\lambda_\sigma$ is defined by 11.10-11.12. Also, of all the prime
words shorter than $g$ whose product with $g$ as right-hand factor is of the
same length $l$ as $g$, let $q$ be one of maximal length. Then we show:


12.40. $l(q) = \lambda_\sigma$.


Proof. As $l(g_{\sigma'}^{-1}g) = l$, and $l(g_{\sigma'}) = \lambda_\sigma$, and since $g_{\sigma'}^{-1}$ certainly is
a prime word shorter than $l$, viz. of length $\lambda_\sigma$, we certainly have $l(q) \ge \lambda_\sigma$.
If $q$ were longer than $\lambda_\sigma$, i.e. $l(q) = \lambda > \lambda_\sigma$, we distinguish two cases:


Either $q$ has the property that there exists a generator $g_{\sigma''}$ of the same
length $\lambda$ as $q$, and of the form $g_{\sigma''} = t_{\sigma''}$ or $f_{\sigma''\tau''}$, such that $l(qg_{\sigma''}) \le \lambda$.
In that case S1(q) and l(qg) =1(g), hence by 2.11: 
<1(g). But by 5.0: so that =I1(g). But 
I(go) =A > Ao, which is a contradiction to 11. 11. 

Or no such generator go” of the same length A as q exists. Then, by 9.1 
and 9.2, qg is either of the form 9.11 with gq. 1, or of one of the forms 
9.12 or 9.13, where again qg241 unless q ends on a term of the form 
forrg2, When the prime factor qg, may, or may not, be one. Again we have to 
treat these cases separately : 


If q is of the form or Y=Qiforr Gz, Le. 
where go” begins with and we put g=qg. By 9.9 1(9) 
=1(qg) =1; and also 


12. 41. l(gog) =1(g) =1, where 1(go") =A > do. 


And, should g be a transform, fo, then also ¢ = q.gq."' is a transform of the 


same length, and also 
12. 42. gor #) with =A> do. 


Besides, with g, also g or # respectively lie in §&—WSs, since they differ 
from g only by factors which are shorter than g, hence expressible by earlier 
generators. But then, 12.41 or 12.42 respectively is a contradiction to 
11.12. Hence this case is not possible. 

If g is of one of the forms 9.12 or 9.13, and ends on a term for: qo, 


then, again by 9.9, with 1(qg) =1(g) =1 also g) = 1 and U(q2g) =I. 




Hence, if we put = g, if g is no transform, or tf = 9 fort, 
if g=t, is a transform, then 9) =1=1(g), or =1(#) 
respectively ; besides g, or ?, are elements of length 1 of §& — Ro, and we have 
the same contradiction to 11.12 as before, as also I(fo»r) =A >Ac. Hence 
1(q) > Ac is impossible, and 12. 40 is proved: 1(q) =Ac=1(go). 

But then it follows from 1(go'g) =1(g) and 1(qg) =I1(g), by 2.12, 
that 1(qgo') S1(go') = Ac, which proves 12.0 for prime words of maximal 
length such that 1(qg) =1(qg). 

If now p is any other prime word such that also 1(pg) =I1(g), then 
l(p) < Ao =1(q), and again by 2. 12, it follows from Il(pg) =1=I1(g) and 
l(qg) =l=1(g) that I(pq*) S1(q) =Ac. As also 1(qgo’) =1(go’) = dv, 
where go has the same meaning as before, we have by 2.11: 1(pgo') S1(go’), 
hence, because of 5.0: l(pgo') =I1(go'). But go is a generator of an earlier 
set do, for which we assumed 12.0 to be true; i.e. there exists a generator 
Jo = to or = for of the same length as p such that l(pgo) S1(p), q.e.4. 

The proof in the case that g is of the form g = fo,"! is strictly analogous, 
using a generator go of maximal length Ac; with the property /(forgo’) 
= I1(go"1g) =I, as before. We shall not repeat it. 


13. So far we have shown that any subgroup $\mathfrak{H}$ of the generalized free
product $\mathfrak{G}$ can be generated by the system $\phi$ of generators (consisting of the
sets $\phi_\sigma$ defined in 3, 11), whose defining relations are given by 12.3. $\mathfrak{H}$ is a
subgroup of $\mathfrak{U}$ if, and only if, $\phi$ consists of the one set $\phi_0$ only. In that case
the relations 12.3 reduce to those of the form 12.31, i.e. simply to the
relations in the subgroup $\mathfrak{V}$ of $\mathfrak{U}$; and about those, they say nothing at all.
In other words, our results give no information whatsoever on the subgroups
of the 'smallest' generalized free product $\mathfrak{U}$.

If $\mathfrak{G}$ is larger than $\mathfrak{U}$, and the subgroup $\mathfrak{H}$ of $\mathfrak{G}$ not contained in $\mathfrak{U}$,
it now remains to interpret the meaning of these relations.


13.0. THEOREM. Every subgroup $\mathfrak{H}$ of $\mathfrak{G}$, not contained in $\mathfrak{U}$, is a
generalized free product.


13.01. Its factors are (i) subgroups $\mathfrak{H}_\sigma$ of conjugates of the groups $\mathfrak{G}_\alpha$,
(ii) subgroups $\mathfrak{H}^*_\sigma$ of conjugates of $\mathfrak{U}$, (iii) groups which we denote by $\mathfrak{F}_{\sigma\tau}$;
in $\mathfrak{F}_{\sigma\tau}$ the amalgamated subgroups generate a group which is self-conjugate
with infinite cyclical factor group.


13.02. All amalgamated subgroups are contained in groups generated
by subgroups of conjugates of the groups $\mathfrak{U}_\alpha$.




Proof. In conjunction with every single set ¢o of the system @ of 
generators, we define the following subgroups of 9: 


1. We consider first the case that the generators of ¢o are of odd length l. 
Let their central factors belong to Ga — Ua. 


(a) If ¢o contains generators ts = SoaCgSo',- - -, let So be the sub- 
group of § which is generated by all these generators to, t’c,- - -. So contains 
all prime 1-transforms zo of the form rg—=SoU,So". For with te, also 
toto = t’o is a generator of go. These elements zo form a subgroup of § 
which we denote by Zo. 

Moreover, if for = SoCo,To; is any other generator of ¢o (i.e. Cor in 
6. —U.), and if So contains generators of the form 


te =— So, 


then let So” be the subgroup of So which is generated by all these elements su’. 

Finally, we denote by Bo the subgroup formed by all elements of §o 
which are transforms of an element of l,. All the groups Go’, and Zo, are 
then subgroups of Bo. 


(b) Next, whether ¢o contains generators to or not, we define for every 
generator for = SoCo,T'o, of $, the following subgroups of §: 


Let Go, be the group of all elements so, of § which can be written in 
the form So(CorUaCo7 1) So, i.e. SI and if no generators 
ts exist in do, then certainly I(sor) <1. Let Zo, be the group of all elements 
to, of § which can be written in the form ter = To;1U qT ar, i.e. I(tor) <1. 

Finally, we denote by Bo, the subgroup of § which is generated by Go, 
and &o,, and by %e, the group generated by ¥,, and for. 


2. If the generators of ¢o are of even length /, their central factors 
belong to Ul. I.e. in the case of a free product with one amalgamated sub- 
group Ul, there are certainly no transforms in ¢c. Here the subgroups 
belonging to the generators for of ¢, are defined exactly as in 1.(b); and in 
this case I(so,) <1 by necessity. 

In the case of a generalized free product, the definitions are strictly 
analogous to 1.(a) and (b). 

If ¢, consists of transforms te = ScUcSo (possibly) and generators 
for = SoU or, we denote by the subgroup of § generated by all to, 
if these exist in dc. It is a subgroup of a conjugate of U. Its subgroups Zu, 
and Bo, as well as and its subgroups Tor, and Bo; are defined 
as in 1., always with Uc; in place of Co,, the significance of the suffixes of the 




groups U, which go into the definitions obtainable from 6.31 (this time 
from 6.31 (1i)). 

Obviously, § is generated by all the groups $c, *, and %or (+ =71(c)). 
[In the case of one amalgamated subgroup U, the groups $*, do not occur. ] 
Therefore the system ¢’ of all the elements to and zo of all the groups > 
and $*o, and all the elements so;, tor, and fo, for all o and r—r(c), is 
itself a system of generators for §. ¢’ includes the original generators t, 
and fo, of ¢. As regards the added generators zo, ser, and tor, it is obvious 
from their definition and 6.31, that the wo are exactly the prime U-trans- 
forms which occur in the relations 8.01, and the sc, and to, are exactly the 
prime U-transforms which occur on the left- and right-hand side respectively 
of the relations 8.02. But in 12 we found that the system ¢ can be taken 
so that these prime U-transforms are themselves generators: the system 
of relations given in 12.3 enables us to express the new generators of the 
system ¢’ by means of generators of the original system ¢: 


13.12. if —1(fer), 

= ta with o’ <<a, if I(sor) <U(for); 


By means of 13. 11-13. 13, the relations 12.31 and 12. 32 of the system ¢ 
become : 


13. 21. II to = 70, 
p=1 
13, 22. tow’ 


and these are now relations between the generators of one and the same 
group, viz. a group §o or §*z, or a group Sor. 

We denote by Rt’ the system of all the relations 13. 11-13. 13 and 13. 21- 
13. 22 between the generators of the system ¢’. ¢’ with the defining relations 
R’ is equivalent to @ with the defining relations Rt, i.e. § can. also be 
described by means of ¢’ and §’. 


If we denote by g, $y,° - - unspecified members of the system of all } 


the subgroups and of §, then the relations of are of two 
different types: those of the form 13.21 or 13.22 are relations between 


generators of one and the same group §g,, while the relations of the form 


13. 11-13. 13 are relations which identify an element of a group g, with 
an element of a group $, (By). Now let h be any element of the meet 




of Sg and §,. If, for the moment, we denote the generators of the system ¢’ 
which belong to Sg and §, respectively, by gg and gy, we have h —h(gg) 
=h(g,). This is a relation between generators of the system ¢’, and follows, 
therefore, from the relations in ’. Hence we may add this relation, and in 
the same way any other relation which expresses that an element belongs to 
the meet of two groups Sg and §,, to the system 9’, and the enlarged 
system will still be a system of defining relations for the generators to, 70, 
sor, tor, for Of S. But now this system of relations is of the kind described 
in I. 2.0; hence § is the generalized free product of all the groups $o, $*o, 
and 
Next we prove 13.02 by showing: 


13.3. The amalgamated subgroups are subgroups of Bo in the case of 
a factor So or $*o, and of Bor in the case of a factor Sor. 


By means of the relations 13. 11-13.13, the reductions (i)-(vi) of 7 
can now be based on the relations 13.21 and 13.22 instead of the relations 
8.01 and 8.02. If then an element A of a factor Sg of § is somehow 
represented by means of the generators gg of ¢’ which belong to §g, then 
also the reduced representation will involve generators of this group. §g only. 

Now let h be an element of the meet of §g and $y. If one of these 
groups, §g say, is of the type So or $*o, then h itself, as element of this 
group, is a generator fo (o’ So), i.e. a reduced word which is unique by 
10.1. Hence, any reduction of the representation of h in §, must lead to 
the same reduced representation, but for application of a relation of the type 
13. 11-13. 12. This proves 13.3 in this case. 

If both Sg and §, are groups %o, and Yor respectively, then a repre- 
sentation of h in either group contains apart from generators which are l- 
transforms, and therefore of the form te, only the generators for and for 
respectively, and in each representation, this is a generator of -maximal 
length. As forfo7, it follows from 10.1 that a reduced representation 
of h in either group cannot contain fo; or for any more, i.e. h is expressible 
by means of the generators so; and to;, and soy and to, respectively. From 
which 13.3 follows also in this case. 

In the case of a factor %o,, the group génerated by all its amalgamated 
subgroups is, in fact, exactly Bo, For, by 13.12 and 13.13, Gor is amal- 
gamated with the subgroup Go’ of $4 or $*o, if elements so, of length 
\(for) exist in Go,, otherwise with a subgroup of some earlier group $o 
or $*o; and Zo, is amalgamated with a subgroup of some group Qo or 
§*o (o” <a). Moreover, the relations 13.21 show that for transforms So, 




into Zo;, i.e. Bor is self-conjugate in }o,, and its factor group is the infinite 
cycle generated by fo;. This completes the proof of 13. 0. 

In the case of a group §o or $*c, we know for certain only that the 
subgroups To and Go" are amalgamated subgroups: by 13.11, To is amal- 
gamated with a subgroup of some earlier factor Oo of $*,-, and each group 
So7 is amalgamated with the subgroup Go, of the corresponding factor %o;. 
But these subgroups need not generate the whole of Bo, which may contain 
other subgroups of conjugates of certain groups Ug, and these may, or may 
not, be amalgamated with subgroups of later groups or O*5 

All these results are the same in the case of a free product with one
amalgamated subgroup, but for the fact that the factors $\mathfrak{H}^*_\sigma$, which are subgroups
of conjugates of $\mathfrak{U}$ in the general case, do not occur in this simpler
case. In particular it follows:


13.4. THEOREM.$^{14}$ If (i) $\mathfrak{G}$ is a free product with one amalgamated
subgroup $\mathfrak{U}$, and (ii) $\mathfrak{U}$ is self-conjugate in $\mathfrak{G}$, then any subgroup $\mathfrak{H}$ of $\mathfrak{G}$
is the free product of all the groups $\mathfrak{H}_\sigma$ and $\mathfrak{F}_{\sigma\tau}$ with the one amalgamated
subgroup $\mathfrak{V} = \mathfrak{H} \cap \mathfrak{U}$. (For the meaning of $\mathfrak{H}_\sigma$ and $\mathfrak{F}_{\sigma\tau}$ cf. 13.01.)


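Proof. $\mathfrak{U}$ is self-conjugate in $\mathfrak{G}$ if, and only if, $\mathfrak{U}$ is self-conjugate in
every factor $\mathfrak{G}_\alpha$ of $\mathfrak{G}$.$^{15}$ Therefore all the groups $\mathfrak{T}_\sigma$, $\mathfrak{S}_\sigma^{\tau}$, $\mathfrak{V}_\sigma$, $\mathfrak{S}_{\sigma\tau}$, and $\mathfrak{T}_{\sigma\tau}$
are identical with the meet $\mathfrak{V}$ of $\mathfrak{H}$ and $\mathfrak{U}$, as any one of these groups consists
of all elements of $\mathfrak{H}$ which are certain transforms of elements of $\mathfrak{U}$. And
conversely, every element $V$ of $\mathfrak{V}$ can be written as a transform $V = SUS^{-1}$
for any given $S$.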
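
As an illustration, the special case $\mathfrak{U} = 1$ of 13.4 can be written out explicitly: $\mathfrak{G}$ is then an ordinary free product, and the hypotheses (i) and (ii) hold trivially. The following is only a sketch under that assumption; the bracket notation $\langle f_{\sigma\tau} \rangle$ for the group generated by $f_{\sigma\tau}$ and the symbol $\mathbb{Z}$ for the infinite cycle are introduced here for the illustration. Since the amalgamated subgroups of a factor $\mathfrak{F}_{\sigma\tau}$ generate a self-conjugate subgroup with infinite cyclical factor group (13.01), and that subgroup is now trivial, each $\mathfrak{F}_{\sigma\tau}$ is itself an infinite cycle:

```latex
% Assumption: trivial amalgamated subgroup, i.e. an ordinary free product.
\[
\mathfrak{V} = \mathfrak{H} \cap \mathfrak{U} = 1, \qquad
\mathfrak{F}_{\sigma\tau} = \langle f_{\sigma\tau} \rangle \cong \mathbb{Z}, \qquad
\mathfrak{H} = \Bigl( \mathop{\ast}_{\sigma} \mathfrak{H}_{\sigma} \Bigr)
   \ast \Bigl( \mathop{\ast}_{\sigma,\tau} \langle f_{\sigma\tau} \rangle \Bigr).
\]
```

Here each $\mathfrak{H}_\sigma$ is a subgroup of a conjugate of some factor $\mathfrak{G}_\alpha$, and the second bracket is a free group; this is the shape of Kurosch's theorem on the subgroups of ordinary free products.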


14. Finally we investigate the relation of the factors $\mathfrak{H}_\sigma$ and $\mathfrak{H}^*_\sigma$ to
the given subgroup $\mathfrak{H}$. We need several preparatory lemmas.


14.0. If $K$ is any element of $\mathfrak{H}$ which is a transform, then $K = Q^{-1}t_\sigma Q$
with $Q$ in $\mathfrak{H}$.


Proof. We use again the reduced representation of $K$ written in the
form 10.11: if the longest generators are of length $l$, then


\[ K = P_0 \prod_{\kappa=1}^{k} (p_\kappa P_\kappa), \]


where the $p_\kappa$ ($\kappa = 1, \ldots, k$) are prime words of length $l$, and the $P_\kappa$


14 Cf. Kurosch and Kalashnikov, [2], for a special case of 13.4.
15 Actually it can be seen without much trouble that the condition (i) of the
theorem follows from (ii) provided that $\mathfrak{G} > \mathfrak{U}$.




($\kappa = 0, \ldots, k$) words in shorter generators, and any two successive
factors $\neq 1$ have the property that their product is longer than each factor
(Cf. 10).


Moreover, we may assume that $P_0 = 1$; for otherwise we transform with
$P_0$, which is an element of $\mathfrak{H}$; hence, if the lemma is true for the transform
$P_0^{-1}KP_0$, then also for $K$.

If $k = 1$: $K = pP$. If here $P \neq 1$, then $l(pP) > l(p)$ and
$l(pP) > l(P)$. As $K$ is a transform, we have: $l(pP \cdot pP) \le l(pP)$, i.e. at
least the first half of the second factor $pP$ is cancelled by the first factor $pP$.
But because of $l(pP) > l(p)$, this means that at least the first $[\tfrac{1}{2}l] + 1$ factors
of the second $p$ are cancelled by the first factor $pP$, i.e. we also have

$l(pPp) < l(pP)$, i.e. $l(Kp) < l(K)$.

But then either $P$ alone cancels all there is cancelled of the right-hand factor
$p$ in this product, i.e. $l(Pp) < l(P)$ contrary to 9.7; or else, $P$ must be
eaten up between the two factors $p$, so that the first $p$ can then cancel the
remaining factors of the second which are cancelled in the whole product
$pPp$. But in $pP$ more factors of $P$ are untouched than are cancelled by $p$;
hence, in $Pp$, more factors of $P$ must cancel than remain unaffected, i.e.
$l(Pp) < l(p)$, again in contradiction to 9.7. Hence $P = 1$, $K = p$ is a
prime word of length $l$, and 14.0 follows from 9.1, 9.4, and 12.1.
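
The cancellation argument above rests on the fact that multiplying a transform by itself cannot increase its length. As a minimal sketch, assuming the simplest setting of an ordinary free product of two cyclic groups with trivial amalgamated subgroup (so that the length of an element is just its number of syllables), the following Python code computes reduced words and checks this for one conjugate of a factor element. The names `ORDERS`, `multiply`, `inverse`, `length` and the choice of factors of orders 2 and 3 are illustrative assumptions, not notation from the paper.

```python
# Hypothetical setting: the free product Z_2 * Z_3 with trivial amalgamation.
# A reduced word is a list of syllables (factor_index, exponent) in which
# neighbouring syllables come from different factors and no exponent is 0.
ORDERS = {0: 2, 1: 3}  # factor index -> order of that cyclic factor


def multiply(left, right):
    """Product of two reduced words, with cancellation and merging of syllables."""
    result = list(left)
    for factor, exponent in right:
        if result and result[-1][0] == factor:
            # Same factor meets same factor: merge, and drop the syllable if trivial.
            _, previous = result.pop()
            merged = (previous + exponent) % ORDERS[factor]
            if merged:
                result.append((factor, merged))
        else:
            result.append((factor, exponent))
    return result


def inverse(word):
    """Inverse of a reduced word: reverse the order and invert every syllable."""
    return [(factor, -exponent % ORDERS[factor]) for factor, exponent in reversed(word)]


def length(word):
    """Length of a reduced word (number of syllables, since the amalgam is trivial)."""
    return len(word)


S = [(0, 1), (1, 1)]                      # an arbitrary reduced word
C = [(1, 1)]                              # an element of the second factor
K = multiply(multiply(inverse(S), C), S)  # a conjugate (transform) of C
W = [(0, 1), (1, 1)]                      # not a conjugate of a factor element

print(length(K), length(multiply(K, K)))  # squaring K does not lengthen it
print(length(W), length(multiply(W, W)))  # squaring W does lengthen it
```

In this toy case the two halves of the conjugate cancel against each other and the central factors merge, which is the mechanism the proof exploits.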

We assume 14.0 to be true for all those transforms whose representation
10.11 contains $k'$ ($1 \le k' < k$) prime words. Then let $K = \prod_{\kappa=1}^{k} (p_\kappa P_\kappa)$ be a
transform whose representation 10.11 contains $k$ prime words $p_\kappa$ of length $l$.
Again we know that in this representation at least the first $[\tfrac{1}{2}l] + 1$
factors of $p_1$ are untouched in the whole product $\prod_{\kappa=1}^{k} (p_\kappa P_\kappa)$, and therefore
are identical with as many factors of the first half of $K$. Since $K$ is a transform,
at least these $[\tfrac{1}{2}l] + 1$ factors of $p_1$ cancel in $Kp_1$. Again, as in the
case $k = 1$, one deduces from this that $P_k = 1$. But then these first $[\tfrac{1}{2}l] + 1$
factors of $p_1$, which cancel in the product $Kp_1$, must do so against the last
$[\tfrac{1}{2}l] + 1$ factors of $p_k$ (because these also are untouched in the whole product
$K = p_1P_1 \cdots P_{k-1}p_k$); hence we have $l(p_kp_1) < l$, so that $p_kp_1$ is expressible
by generators which are all shorter than $l$, and


\[ p_1^{-1}Kp_1 = P_1 p_2 P_2 \cdots p_{k-1} (P_{k-1}p_kp_1) \]


involves only $k - 2$ prime words of length $l$. By induction:


$p_1^{-1}Kp_1 = Q_1^{-1}t_\sigma Q_1$, i.e. $K = Q^{-1}t_\sigma Q$,




where with $Q_1$ also $Q = Q_1p_1^{-1}$ is an element of $\mathfrak{H}$.

14.1. Let $K$ be again a transform in $\mathfrak{H}$, and $Q, Q_1, Q_2, \ldots$ all the
elements of $\mathfrak{H}$ such that

\[ K = Q^{-1}t_\sigma Q = Q_1^{-1}t'_\sigma Q_1 = Q_2^{-1}t''_\sigma Q_2 = \cdots, \]

where $t_\sigma, t'_\sigma, t''_\sigma, \ldots$ all belong to the same set $\phi_\sigma$. If $Q$ is one such element
of minimal length, then $l(t_\sigma Q) \ge l(Q)$.


Proof. If we had $l(t_\sigma Q) < l(Q)$, we put $\bar{Q} = t_\sigma Q$; then $\bar{Q}^{-1}t_\sigma\bar{Q} = Q^{-1}t_\sigma Q
= K$; but $\bar{Q}$ is shorter than $Q$, contrary to the assumption that $Q$ is of
minimal length.

If we put $t_\sigma = S_\sigma C_\sigma S_\sigma^{-1}$ and $K = S(K)C(K)T(K)$, where
$Q$ is again as short as possible, then $Q^{-1}$ cancels at most $S_\sigma$, and cancels and
amalgamates at most $S_\sigma C_\sigma$. Hence:

14.11. $S(K) = Q^{-1}S_\sigma C$, where $C$ belongs to the same group as $C(K)$;

14.12. Hence: $C(K) = C^{-1}C_\sigma C$, i.e. $C_\sigma$ belongs to the same group
as $C(K)$.

From this we conclude:

14.21. If $K$ and $K'$ are two transforms of the same length in $\mathfrak{H}$, and
$l(KK') \le l(K)$, then $K = Q^{-1}t_\sigma Q$, $K' = Q^{-1}t'_\sigma Q$, where $t_\sigma$ and $t'_\sigma$ belong
to the same set $\phi_\sigma$.


Proof. $K$ and $K'$ may be assumed to have identical first halves, and their
central factors belong to the same group. By 14.0, we have $K = Q^{-1}t_\sigma Q$
where we may assume $Q$ to be of minimal length; hence, by 14.11 and 14.12:

$S(K) = Q^{-1}S_\sigma C$, $C(K) = C^{-1}C_\sigma C$,

and therefore $S(K') = Q^{-1}S_\sigma C$, $C(K') = C(K)C'$ ($C'$ in the same group as
$C(K)$). Hence: $K' = Q^{-1}t'_\sigma Q$.

The product $K_1 = KK'$, if shorter than $K$, is a prime $\mathfrak{U}$-transform. And
because of 14.21, $K_1 = Q^{-1}(t_\sigma t'_\sigma)Q$, which is a prime $\mathfrak{U}$-transform if, and
only if, $t_\sigma t'_\sigma = \tau_\sigma = S_\sigma U_\alpha S_\sigma^{-1}$. If, on the other hand, $K_1$ and $K$ are two
transforms, $l(K_1) < l(K)$, such that $l(K_1K) = l(K)$, then $K' = K_1K$ is a
transform of the same length as $K$, satisfying the assumption of 14.21. Hence,
14.21 applied to $K_1K$ and $K$ gives at once:


14.22. If $K$ and $K_1$ are two transforms in $\mathfrak{H}$, $l(K_1) \le l(K)$, such that
$l(K_1K) = l(K)$, then $K_1 = Q^{-1}t_{\sigma'}Q$, $K = Q^{-1}t_\sigma Q$, where $\sigma' = \sigma$ if $l(K_1)
= l(K)$; and $\sigma' < \sigma$, but $t_{\sigma'} = \tau_\sigma$ in the same group $\mathfrak{H}_\sigma$, or $\mathfrak{H}^*_\sigma$, as $t_\sigma$, if
$l(K_1) < l(K)$.


Finally we show that the set $\phi_\sigma$ of the generator $t_\sigma$ in 14.0 is, in general,
uniquely determined by the transform $K$:

14.3. If $K$ is a transform in $\mathfrak{H}$, but not a $\mathfrak{U}$-transform, and $K = Q^{-1}t_\sigma Q
= Q_1^{-1}t_{\sigma'}Q_1$, then $\sigma = \sigma'$.


Proof. If 14.3 were not true, we may assume $\sigma' < \sigma$, i.e. $l(t_{\sigma'}) \le l(t_\sigma)$.
We have, with $QQ_1^{-1} = \bar{Q}$:

\[ \bar{Q}^{-1}t_\sigma\bar{Q} = t_{\sigma'}. \]

Here $\bar{Q} \neq \tau_\sigma = S_\sigma U_\alpha S_\sigma^{-1}$, since otherwise $\sigma = \sigma'$. We also assume $\bar{Q}$ to be a
reduced word. $t_{\sigma'}$ is certainly reduced, hence unique by 10.1. Reducing the
left-hand side $\bar{Q}^{-1}t_\sigma\bar{Q}$ must, therefore, lead to the same result $t_{\sigma'}$.

If the longest generators in $\bar{Q}$ are shorter than $t_\sigma$, it follows immediately
that $l(t_\sigma) = l(t_{\sigma'})$, and then $\sigma = \sigma'$ by 10.1. If the longest generators in $\bar{Q}$
are at least of length $l(t_\sigma)$, all of them must drop out in the course of the
reduction. But $\bar{Q}$, and its formal inverse $\bar{Q}^{-1}$, are reduced. Hence a reduction
can arise only from $t_\sigma$ in conjunction with the partial product $Q_0g_1$ of $\bar{Q}$
(where $g_1$ is the first generator of maximal length, and $Q_0$ a word in shorter
generators or the unit element), i.e. $g_1^{-1}Q_0^{-1}t_\sigma Q_0g_1$ must be shorter than $g_1$,
and therefore $Q_0^{-1}t_\sigma Q_0$ a prime $\mathfrak{U}$-transform, i.e. $t_\sigma$ a $\mathfrak{U}$-transform, contrary
to our assumption.


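14.0-14.3 together give immediately:


14.41. THEOREM. If $S^{-1}\mathfrak{G}_\alpha S$ is a conjugate of one of the factors $\mathfrak{G}_\alpha$,
$\mathfrak{H}_S = \mathfrak{H} \cap (S^{-1}\mathfrak{G}_\alpha S)$ its meet with $\mathfrak{H}$, then $\mathfrak{H}_S$ is conjugate in $\mathfrak{H}$ to a group
$\mathfrak{H}_\sigma$: $\mathfrak{H}_S = Q^{-1}\mathfrak{H}_\sigma Q$ with $Q$ in $\mathfrak{H}$. $\mathfrak{H}_\sigma$ is uniquely determined, unless $\mathfrak{H}_S$, and
therefore $\mathfrak{H}_\sigma$, is contained in a conjugate of a group $\mathfrak{U}_\alpha$.


14.42. THEOREM. If $S^{-1}\mathfrak{U}S$ is a conjugate of $\mathfrak{U}$ in $\mathfrak{G}$, $\mathfrak{H}^*_S
= \mathfrak{H} \cap (S^{-1}\mathfrak{U}S)$ its meet with $\mathfrak{H}$, then $\mathfrak{H}^*_S$ is conjugate in $\mathfrak{H}$ to a group
$\mathfrak{H}^*_\sigma$: $\mathfrak{H}^*_S = Q^{-1}\mathfrak{H}^*_\sigma Q$ with $Q$ in $\mathfrak{H}$. $\mathfrak{H}^*_\sigma$ is uniquely determined, unless $\mathfrak{H}^*_S$,
and therefore $\mathfrak{H}^*_\sigma$, is a subgroup of a conjugate of a group $\mathfrak{U}_\alpha$.


In the case of a free product with one amalgamated subgroup, 14.42 is, of
course, irrelevant. These results again generalize Kurosch's results for
ordinary free products.


15. If $\mathfrak{G}$ is the free product of the factors $\mathfrak{G}_\alpha$ with one amalgamated
subgroup $\mathfrak{U}$, the essential points of our results are briefly these: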




Every subgroup $\mathfrak{H}$ of $\mathfrak{G}$ is a generalized free product. The amalgamated
subgroups are all contained in conjugates of $\mathfrak{U}$; the factors are of two different
kinds: Either they are subgroups of conjugates of the factors $\mathfrak{G}_\alpha$ of $\mathfrak{G}$; these
are, in general, uniquely determined, but for transformation in $\mathfrak{H}$. Or they
are of the type $\mathfrak{F}_{\sigma\tau}$: the amalgamated subgroups of such a factor generate a
self-conjugate subgroup of $\mathfrak{F}_{\sigma\tau}$ whose factor group is an infinite cycle.

These results, for the free product with one amalgamated subgroup, first
led me to introduce the generalized free product. But the method used to
investigate the simpler case yielded the corresponding results for the subgroups
of a generalized free product $\mathfrak{G}$ with amalgamated subgroups $\mathfrak{U}_{\alpha\beta}$, as long
as the subgroup is not contained in the smallest free product $\mathfrak{U}$. In this
case, the factors of a subgroup $\mathfrak{H}$ are partly of the same types as before:
factors $\mathfrak{F}_{\sigma\tau}$, and factors which are subgroups of conjugates of the $\mathfrak{G}_\alpha$. But
in addition, we may have factors which are subgroups of conjugates of $\mathfrak{U}$.
The amalgamated subgroups are subgroups of conjugates of the $\mathfrak{U}_\alpha$, i.e. the
groups generated in each $\mathfrak{G}_\alpha$ by the $\mathfrak{U}_{\alpha\beta}$ ($\alpha$ fixed), which together generate $\mathfrak{U}$.

Now in $\mathfrak{G}$, $\mathfrak{U}$ together with any one group $\mathfrak{G}_\alpha$ generates a group which
is the free product of $\mathfrak{U}$ and $\mathfrak{G}_\alpha$ with the one amalgamated subgroup $\mathfrak{U}_\alpha$.
It is this property of $\mathfrak{G}$, rather than the full fact that $\mathfrak{G}$ is the generalized
free product of the groups $\mathfrak{G}_\alpha$ with amalgamated $\mathfrak{U}_{\alpha\beta}$, which is reflected in
our results on the subgroups $\mathfrak{H}$ of $\mathfrak{G}$. This is probably not due to the method
only. The examples in Part I (8) gave us some indication that genuine
difficulties are likely to arise in $\mathfrak{U}$. In this part (8), we came across one of
the reasons: an element of $\mathfrak{U}$ may transform an element of one factor $\mathfrak{U}_\alpha$
into an element of another factor $\mathfrak{U}_\beta$, even if none of these elements belongs
to the meet of $\mathfrak{U}_\alpha$ and $\mathfrak{U}_\beta$. It seems that the subgroups of $\mathfrak{U}$ cannot, in
general, be represented as generalized free products derived from the corresponding
representation of $\mathfrak{U}$. But we are not going to follow up this
problem here.


UNIVERSITY COLLEGE, HULL. 


REFERENCES.


[1] A. Kurosch, Mathematische Annalen, vol. 109 (1934), pp. 647-660.
[2] A. Kurosch and V. Kalaschnikov, Comptes Rendus de l'Académie des Sciences,
URSS, vol. 1 (1935), pp. 285, 286 (quoted from: Zentralblatt für Mathematik,
vol. 11 (1935), p. 151).
[3] B. H. Neumann, Journal of the London Mathematical Society, vol. 18 (1943), pp.
12-20.




SERIES TO SERIES TRANSFORMATIONS AND ANALYTIC 
CONTINUATION BY MATRIX METHODS.* 


By P. VERMES. 


1. Introduction. Several methods are known for the generalized
summation by matrices of a series $u_0 + u_1 + \cdots$ with partial sums
$s_k = u_0 + u_1 + \cdots + u_k$. The standard method is the transformation of
the sequence $s_k$ by a matrix $F = (f_{nk})$ into a convergent sequence $\sigma_n$,
so that $\sigma_n = \sum_k f_{nk}s_k$. The method is called regular if the existence of
$\lim s_k$ implies the existence of $\lim \sigma_n$, and both limits are equal. We shall
call the matrix of a regular sequence to sequence method a T-matrix. Sufficient
and necessary conditions for a matrix to be a T-matrix are well known.
(Dienes 389.)$^1$

Another method is the transformation of the series $\sum u_k$ into a convergent
sequence $\sigma_n$ by a matrix $G = (g_{nk})$, so that $\sigma_n = \sum_k g_{nk}u_k$. We shall call a
matrix of a regular series to sequence method a $\gamma$-matrix. Sufficient and
necessary conditions for $G$ to be a $\gamma$-matrix are (Dienes 396):


(1.1) $\sum_k | g_{nk} - g_{n,k+1} | \le M$ for $n = 0, 1, 2, \ldots$,

(1.2) $g_{nk} \to 1$ as $n \to \infty$ for every fixed $k = 0, 1, 2, \ldots$.


The third method is the transformation of the series $\sum u_k$ by the matrix
$A = (a_{nk})$ into a convergent series $\sum v_n$, so that $v_n = \sum_k a_{nk} u_k$. In section 2
we shall give sufficient and necessary conditions for the matrix A to be
regular, in which case we shall call it an α-matrix. We also prove that the
product of a γ-matrix and an α-matrix is a γ-matrix, and the product of
two α-matrices is an α-matrix.

In section 3 we obtain the general expression for the matrix of the
series to series transformation representing the continuation of a Taylor
series. Such matrices form a one parameter family of upper-semi-matrices
$a_{nk} = \binom{k}{n} \lambda^{k-n} (1-\lambda)^{n}$. The corresponding sequence to sequence
transformation matrices are $F = (1-\lambda)A$. The product of matrices of the


* Received August 26, 1947. 
¹ References are given at the end of this paper.




same family is commutative and associative and belongs to the family.
Regularity, translative properties, equivalence of A- and F-methods depend
on the parameter. There are also investigations on the field of efficiency
for the series $\sum z^k$ and $\sum c_k z^k$ for simple and double transformations and for
the inverses of the methods.

In section 4 we discuss the connection between the matrices of section 3
and those of the Euler summation methods $E(r)$ and $\mathcal{E}(r)$, elaborated
recently by Agnew. The Euler matrices are transposed matrices of A and F.
If A is an α-matrix its transpose is an Euler T-matrix, and if F is a T-matrix
its transpose is an Euler α-matrix.

In section 5 we obtain the matrix A of the series to series transformation
representing the continuation of the Laurent series. The result
is a one parameter family of matrices $a_{nk} = \binom{k+n-1}{n} \lambda^{k} (1-\lambda)^{n}$, with
corresponding matrices for sequence to sequence transformation $(1-\lambda)A/\lambda$.
The properties of the matrices are investigated for various values of the
parameter, and the field of efficiency for the series $\sum z^k$ and $\sum c_k z^k$. For
certain values of λ the efficiency of the summation extends to values of z
which are not in the principal star domain.


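2. Series to series transformations, α-matrices. The series $u_0 + u_1 + \cdots$
is summable by the matrix A to the sum s if

(2.1)  $v_n = \sum_{k=0}^{\infty} a_{nk} u_k$ exists for $n = 0, 1, \ldots$,

and

(2.2)  $\sum_{n=0}^{\infty} v_n = s$.

Writing $\sigma_n = v_0 + v_1 + \cdots + v_n$, (2.1) and (2.2) are equivalent to
$\sigma_n \to s$ as $n \to \infty$. But

(2.3)  $\sigma_n = \sum_{j=0}^{n} \sum_{k=0}^{\infty} a_{jk} u_k = \sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{n} a_{jk} \Bigr) u_k$,

and writing

(2.4)  $a_{0k} + a_{1k} + \cdots + a_{nk} = g_{nk}$,

we have

(2.5)  $\sigma_n = \sum_{k=0}^{\infty} g_{nk} u_k$,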




where the existence of (2.1) implies the existence of (2.5). Conversely it
follows from (2.4) that

(2.6)  $a_{0k} = g_{0k}$,  $a_{nk} = g_{nk} - g_{n-1,k}$  $(n = 1, 2, \ldots)$,

and if $\sigma_n$, defined by (2.5), exists for $n = 0, 1, \ldots$,

(2.7)  $v_n = \sum_{k=0}^{\infty} a_{nk} u_k = \sum_{k=0}^{\infty} (g_{nk} - g_{n-1,k}) u_k = \sigma_n - \sigma_{n-1}$,

so that the existence of (2.5) implies that of (2.1). Hence

2.I. If the infinite matrices A and G are connected by the relation (2.6)
or (2.4), then the series to series transformation by A and the series to
sequence transformation by G are equivalent, i.e. the G-transforms are the
partial sums of the series obtained by A-transformation.
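
As an illustration of 2.I, the following minimal sketch checks on a small finite block that the G-transforms built by (2.4) coincide with the partial sums of the A-transforms. The matrix entries and the terms u_k are hypothetical values chosen only for the example, and the helper names `g_from_a` and `transform` are not from the paper.

```python
from fractions import Fraction

def g_from_a(a):
    """Build G from A by (2.4): g[n][k] = a[0][k] + ... + a[n][k]."""
    rows, cols = len(a), len(a[0])
    g = [[Fraction(0)] * cols for _ in range(rows)]
    for k in range(cols):
        running = Fraction(0)
        for n in range(rows):
            running += a[n][k]
            g[n][k] = running
    return g

def transform(m, u):
    """Apply a finite matrix block to a finite list of terms."""
    return [sum(row[k] * u[k] for k in range(len(u))) for row in m]

# Hypothetical 3x4 block of A and four terms u_k (illustration only).
A = [[Fraction(1, 2), Fraction(1, 3), Fraction(0), Fraction(2)],
     [Fraction(1, 4), Fraction(-1), Fraction(5), Fraction(1, 7)],
     [Fraction(3), Fraction(1, 6), Fraction(-2), Fraction(0)]]
u = [Fraction(1), Fraction(-2), Fraction(3, 5), Fraction(4)]

v = transform(A, u)                 # v_n as in (2.1), restricted to the block
sigma = transform(g_from_a(A), u)   # sigma_n as in (2.5)
partial_sums = [sum(v[:n + 1]) for n in range(len(v))]
```

For finite blocks the agreement of `sigma` and `partial_sums` is exact; for infinite matrices the interchange of summations in (2.3) is what the text justifies.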


2.II. The matrix $A = (a_{nk})$ is an α-matrix if and only if G, given by
$g_{nk} = a_{0k} + \cdots + a_{nk}$, is a γ-matrix.

It follows from (1.1), 2.II, (2.6) and (1.2) that

(2.8)  $\sum_{k=0}^{\infty} |a_{nk} - a_{n,k+1}| \le M$ for $n = 0, 1, \ldots$,

(2.9)  $\sum_{n=0}^{\infty} a_{nk} = 1$ for every k, and hence $\lim_{n \to \infty} a_{nk} = 0$.


Conditions (2.8) and (2.9) are satisfied by every α-matrix. They are not
sufficient, as shown by the example


(2. 10) = 1 for n= —1 fork <n< 2k, 0 for n= 2k. 


If G is the lower semi-γ-matrix of ordinary convergence

(2.11)  $g_{nk} = 1$ for $k \le n$,  $g_{nk} = 0$ for $k > n$,

the unit matrix is the corresponding α-matrix.

2.III. The product GA of a γ-matrix G and an α-matrix A exists and
is a γ-matrix.

Proof. It follows from (2.9) and from the regularity of G that

(2.12)  $h_{nk} = \sum_{j=0}^{\infty} g_{nj} a_{jk}$




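exists for every fixed n and k, and that

(2.13)  $h_{nk} \to 1$ as $n \to \infty$ for every fixed k.

Also writing $a_{0k} + a_{1k} + \cdots + a_{jk} = b_{jk}$, B is a γ-matrix, and

  $h_{nk} = \sum_{j=0}^{\infty} g_{nj} (b_{jk} - b_{j-1,k}) = \sum_{j=0}^{\infty} b_{jk} (g_{nj} - g_{n,j+1}) + \lim_{j \to \infty} g_{nj} b_{jk}$,

where $b_{-1,k} = 0$. It follows from (1.1) that, as $j \to \infty$, $g_{nj}$ tends to a finite
limit, and from (1.2) that $b_{jk} \to 1$, hence

(2.14)  $\sum_{k=0}^{\infty} |h_{nk} - h_{n,k+1}| \le M_1 M_2$ for every n.

Thus, by (2.12), H = GA exists, and, by (2.14) and (2.13), it is a γ-matrix.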


Note. The product AG may not exist, as can be seen from the example 


2.IV. The product of two α-matrices exists and is an α-matrix.

Proof. If A and B are α-matrices, and G, given by (2.4), is a γ-matrix,
then by 2.III, H = GB is a γ-matrix, i.e. if $c_{nk} = h_{nk} - h_{n-1,k}$ ($h_{-1,k}$ being 0),
then, by 2.II, C is an α-matrix. But

  $c_{nk} = \sum_{j=0}^{\infty} g_{nj} b_{jk} - \sum_{j=0}^{\infty} g_{n-1,j} b_{jk} = \sum_{j=0}^{\infty} a_{nj} b_{jk}$;

hence C = AB exists and is an α-matrix.

2.V. A sufficient and necessary condition for the matrix product
H = GA to exist and be a γ-matrix for every γ-matrix G is that A should
be an α-matrix.

Proof. The sufficiency follows from 2.III. To establish the necessity
(for every γ-matrix G), we select the γ-matrix of example (2.11), and then
$h_{nk} = a_{0k} + a_{1k} + \cdots + a_{nk}$, so that H is a γ-matrix only if A is an
α-matrix.

This result is parallel to a previous result (Vermes 1) which states that
a sufficient and necessary condition for the matrix product H = AG to exist
and be a γ-matrix for every γ-matrix G is that A should be a T-matrix. The
central position of the series to sequence method is demonstrated by the above
results. This is also shown by the fact that if the matrices are triangular




matrices, then for a given G the corresponding equivalent matrices F and
A are

(2.15)  $f_{nk} = g_{nk} - g_{n,k+1}$ for sequence to sequence method,

(2.16)  $a_{nk} = g_{nk} - g_{n-1,k}$ for series to series method.

If G is a general matrix, the methods G and A are equivalent, but the
methods G and F are equivalent if and only if

(2.17)  $\lim_{n \to \infty} ( \lim_{k \to \infty} g_{nk} s_k ) = 0$  (Vermes 4).

2.VI. If $A_j$ are α-matrices $(j = 0, 1, \ldots, p)$ and
$\beta_0 + \beta_1 + \cdots + \beta_p = L \ne 0$, then the matrix $H = (\sum \beta_j A_j)/L$ is an α-matrix.

This follows immediately from a similar theorem on γ-matrices (Vermes 1,
Theorem 1.II).

Thus if A and B are α-matrices and a and b any numbers such that
$a + b \ne 0$, then there exists a unique α-matrix C such that $aA + bB = (a+b)C$.
Hence an algebra of α-matrices can be formed in which the
above mean represents sum, and matrix-product represents product. This
algebra is incomplete, since the zero matrix is not an α-matrix.

If C is an α-matrix all of whose columns are equal, and A any α-matrix,
then CA = C. If B is another α-matrix, then CB = C, hence C(A − B) = 0.
Thus C has the left-hand zero property.


3. Continuation of the Taylor series as matrix transformation. A. 
Robinson has proved (Cooke 1) that if f(z) = Sanz", with partial sums 
has radius of convergence r>1, and if for 0< 2’ <1, f(z) 
= 2’)", with partial sums s’,(z), then s’n(z”) = fnxSe(2”), where 
>1, and $F = (f_{nk})$ is a non-negative upper-semi-T-matrix.

In this section we develop a similar idea, considering in the first instance 
series to series transformations, obtaining simple expressions for classes of 
matrices depending on one parameter, and investigating some properties of 
these matrices. We also obtain simple expressions for the corresponding 
matrices for sequence to sequence transformations. The investigations are 


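not restricted to regular matrices. Let

(3.1)  $f(z) = c_0 + c_1 z + c_2 z^2 + \cdots$

be regular in a circle, and $z = \beta$ be a point inside this circle, so that

(3.2)  $f(z) = \sum_{n=0}^{\infty} C_n (z - \beta)^n$, where $C_n = \sum_{k=n}^{\infty} \binom{k}{n} c_k \beta^{k-n}$,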




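for $n, k = 0, 1, 2, \ldots$ with the convention that

(3.3)  $\binom{k}{0} = 1$ for every k,  $\binom{k}{n} = 0$ for $k < n$.

If $z'$ is in the circle of convergence of the series (3.2),

  $f(z') = \sum_{n=0}^{\infty} \sum_{k=n}^{\infty} \binom{k}{n} c_k \beta^{k-n} (z' - \beta)^n$,

and writing $\beta/z' = \lambda$, we have

(3.4)  $f(z') = \sum_{n=0}^{\infty} \sum_{k=n}^{\infty} \binom{k}{n} \lambda^{k-n} (1-\lambda)^n c_k z'^k$.

Thus the continuation (3.2) of the series (3.1) can be regarded as the
generalized sum by series to series transformation of the series (3.1) by the
upper semi-matrix A, where

(3.5)  $a_{nk} = \binom{k}{n} \lambda^{k-n} (1-\lambda)^n$.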

This matrix depends only on the ratio $\beta/z' = \lambda$, but not on the particular
values of β or z', so that the same matrix represents continuations about other
points to other values of z. It is independent of the coefficients $c_k$ of the
series. The parameter λ may take any complex value, and (3.5) defines a
family of matrices which we shall denote by A(λ). The series to series
summation method by such a matrix will be called an A(λ)-method.


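For any λ we have

(3.6)  $a_{0k} + a_{1k} + \cdots + a_{kk} = 1$ for a fixed k.

This is equivalent to

(3.7)  $\lim_{n \to \infty} g_{nk} = 1$, G being defined as in (2.4).

A short calculation gives

(3.8)  $g_{nk} - g_{n,k+1} = \binom{k}{n} \lambda^{k-n} (1-\lambda)^{n+1}$ for $k \ge n$,  $= 0$ for $k < n$,

so that

(3.9)  $S_n = \sum_{k=0}^{\infty} |g_{nk} - g_{n,k+1}| = |1-\lambda|^{n+1} \sum_{k=n}^{\infty} \binom{k}{n} |\lambda|^{k-n}$,

which converges if and only if $|\lambda| < 1$ or $\lambda = 1$.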


Assuming (3.9) we have $S_n = |1-\lambda|^{n+1}/(1-|\lambda|)^{n+1}$, bounded if




(3.10)  $|1-\lambda| \le 1 - |\lambda|$.

The last two conditions are simultaneously satisfied if and only if $0 \le \lambda \le 1$,
and then the matrix G satisfies condition (1.1). Also (3.6) shows that
(1.2) is satisfied. Hence we have proved:

3.I. A(λ) is an α-matrix if and only if $0 \le \lambda \le 1$.

3.II. A(λ) is an α-matrix if and only if it is non-negative.

The second result follows from 3.I and the formula (3.5).
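
The column-sum identity (3.6) and the difference formula (3.8), on which the proof of 3.I rests, can be checked in exact arithmetic. The sketch below is only an illustration: the function names `a` and `g` and the sample value of the parameter are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def a(n, k, lam):
    """Element (3.5): C(k,n) * lam^(k-n) * (1-lam)^n for k >= n, else 0."""
    if k < n:
        return Fraction(0)
    return comb(k, n) * lam ** (k - n) * (1 - lam) ** n

def g(n, k, lam):
    """Series to sequence element (2.4): a(0,k) + ... + a(n,k)."""
    return sum(a(j, k, lam) for j in range(n + 1))

lam = Fraction(2, 7)  # arbitrary illustrative parameter

# (3.6): each column of A(lam) sums to 1 (the sum is finite, n = 0..k).
column_sums_ok = all(g(k, k, lam) == 1 for k in range(8))

# (3.8): g_nk - g_{n,k+1} equals (1 - lam) * a_nk, including the zero case k < n.
differences_ok = all(
    g(n, k, lam) - g(n, k + 1, lam) == (1 - lam) * a(n, k, lam)
    for n in range(6) for k in range(8)
)
```

Rational arithmetic is used so that the identities are tested exactly rather than up to rounding.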


(3.11) Definition. If $v_n$ denotes the A-transform of the series $u_0 + u_1 + \cdots$,
and $v'_n$ the A-transform of $0 + u_0 + u_1 + \cdots$, we say that A is
translative to the left if the existence of $\sum v_n$ implies the existence of $\sum v'_n$
and the equality of both sums. If the reverse holds, we say that A is
translative to the right. If A is translative to both the right and the left,
we call it translative. Matrices possessing this latter property are called
regular by Dienes, Cooke and the author in their publications, named in the
References. To denote 'translative to the left' they use the term semi-regular.
Perron calls a translative method permanent, and Agnew uses
the expression "A permits the adjunction or omission of elements". The
expressions adopted for this paper are taken over from Hill.


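3.III. A(λ) is translative to the left.

Proof. For $n \ge 1$, $v_n = \sum_{k=0}^{\infty} a_{nk} u_k$, $v'_n = \sum_{k=0}^{\infty} a_{n,k+1} u_k$, so that for $n \ge 1$,
$v'_n = \lambda v_n + (1-\lambda) v_{n-1}$, and $v'_0 = \lambda v_0$. Hence writing $\sigma_n = v_0 + v_1 + \cdots + v_n$,
$\sigma'_n = v'_0 + v'_1 + \cdots + v'_n$,

(3.12)  $\sigma'_n = \lambda \sigma_n + (1-\lambda) \sigma_{n-1}$,

so that $\lim \sigma_n = s$ implies $\lim \sigma'_n = s$.

3.IV. A(λ) is translative to the right if $|1 - 1/\lambda| < 1$.

Proof. From (3.12), writing $(\lambda - 1)/\lambda = b$, we obtain
$b^{n-j} \sigma_j - b^{n-j+1} \sigma_{j-1} = b^{n-j} \sigma'_j / \lambda$ for $j = 0, 1, \ldots, n$, hence

(3.13)  $\sigma_n = (b^n \sigma'_0 + b^{n-1} \sigma'_1 + \cdots + b \sigma'_{n-1} + \sigma'_n)/\lambda$.

The right-hand side of (3.13) is the transform of the sequence $\sigma'_k$ by the
triangular matrix $f_{nk} = b^{n-k}/\lambda$ which is a T-matrix (Dienes 389) if $|b| < 1$,
and then $\sigma'_n \to s$ implies $\sigma_n \to s$.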




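That the condition is not necessary can be seen from $\lambda = 0$, when A
is the unit matrix, which is obviously translative.

The product of two matrices of the class A(λ) always exists. Writing
$1 - \lambda = p$, and using the letters p, q, r in this sense, the matrix A(p)
shall denote

(3.14)  $a_{nk} = \binom{k}{n} p^n (1-p)^{k-n}$.

3.V. The matrix product A(p)A(q) is the matrix A(pq).

Proof. If A(p)A(q) = B,

  $b_{nk} = \sum_{j=n}^{k} \binom{j}{n} p^n (1-p)^{j-n} \binom{k}{j} q^j (1-q)^{k-j}$ for $k \ge n$,  $= 0$ for $k < n$.

But $\binom{j}{n} \binom{k}{j} = \binom{k}{n} \binom{k-n}{j-n}$ for $n \le j \le k$, thus

  $b_{nk} = \binom{k}{n} (pq)^n \sum_{j=n}^{k} \binom{k-n}{j-n} [(1-p)q]^{j-n} (1-q)^{k-j} = \binom{k}{n} (pq)^n (1-pq)^{k-n}$,

and this proves the statement.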
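
Since the matrices A(p) are upper triangular, every element of a product is a finite sum, so 3.V can be verified exactly on a finite top-left block. The following is a minimal sketch under that observation; the block size and the sample parameters are arbitrary choices for the example, and `A` and `matmul` are illustrative helper names.

```python
from fractions import Fraction
from math import comb

def A(p, size):
    """Top-left size x size block of A(p) from (3.14)."""
    return [[comb(k, n) * p ** n * (1 - p) ** (k - n) if k >= n else Fraction(0)
             for k in range(size)] for n in range(size)]

def matmul(x, y):
    """Product of two square blocks of equal size."""
    size = len(x)
    return [[sum(x[n][j] * y[j][k] for j in range(size)) for k in range(size)]
            for n in range(size)]

p, q = Fraction(1, 3), Fraction(-5, 2)   # arbitrary illustrative parameters

product_ok = matmul(A(p, 7), A(q, 7)) == A(p * q, 7)            # 3.V
inverse_ok = matmul(A(p, 7), A(1 / p, 7)) == A(Fraction(1), 7)  # A(1) is the unit matrix
```

The second check corresponds to taking q = 1/p in 3.V, for which the product is A(1), the unit matrix.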


The matrices A(p) form a class in which products are associative and
commutative. If $p \ne 0$, A(p) has a unique two-sided reciprocal A(1/p)
belonging to the same class. To p = 1 corresponds the unit matrix I, and
to p = 0 the matrix Q whose first row consists of ones and whose other
elements are all zero.

We have QA(p) = A(p)Q = Q.

Formula (3.8) can be written in the form

(3.15)  $g_{nk} - g_{n,k+1} = (1-\lambda)\, a_{nk}(\lambda) = p\, a_{nk}(p)$,

which in view of (2.15) states that the sequence to sequence summation
matrix corresponding to A(λ) = A(p) is given by

(3.16)  $F(\lambda) = (1-\lambda) A(\lambda) = F(p) = p\, A(p)$.

It follows from (3.15) and 3.V that

3.VI. The matrix product F(p)F(q) is the matrix F(pq).

If p = 1, F(p) is the unit matrix; if p = 0, F(p) is the zero matrix.




For $p \ne 0$ F(p) has the unique two-sided reciprocal F(1/p). The matrices
F(p) form a class in which products are associative and commutative.

3.VII. F(λ) is a T-matrix if and only if $0 \le \lambda < 1$.

3.VIII. F(λ) is a T-matrix if and only if it is non-negative and
different from the zero matrix.

3.IX. F(λ) is translative to the right (left) if and only if A(λ) is
the same.

The last three results are consequences of (3.16) and can be proved as
the corresponding results for A(λ).


(3.17) Definition. A summation method A is said to include the
method B, if each series summable B is also summable A, the two generalized
sums being equal.

The methods A(λ) and F(λ) are in general not equivalent. The relationship
between them is revealed in the following:

3.X. For $\lambda \ne 1$ A(λ) includes F(λ).

Proof. The convergence of $(1-\lambda) \sum_k a_{nk} s_k$ implies that as
$k \to \infty$, $a_{nk} s_k \to 0$, and therefore, by (2.4), $g_{nk} s_k \to 0$ for every fixed n, so that
(2.17) is satisfied.

3.XI. A necessary condition for F(λ) to include A(λ) is that $|\lambda| < 1$.

Proof. A(λ) sums the series $1 + 0 + 0 + \cdots$ to 1. The F(λ)-transforms
of the sequence of partial sums 1, 1, ... are $\sigma_n = (1-\lambda)^{n+1} \sum_{k=n}^{\infty} \binom{k}{n} \lambda^{k-n}$,
and these converge to 1 only if $|\lambda| < 1$.

3.XII. If the Taylor series $\sum c_k z^k$ has a non-zero radius of convergence,
and if for $|\lambda| < 1$ A(λ) sums the series in an open domain D, then
F(λ) sums the series to the same sum at all inner points of D.

Proof. If z is in the circle of convergence, then (2.17) is obviously
satisfied. If z is on or outside the circle, we can apply the following Lemma
(Vermes 4, pp. 72-73):

Lemma. We suppose that $\sum c_k z^k$ has a finite non-zero radius of convergence,
and that the G-method is efficient for the series in a domain D outside
or on the circle of convergence. Then the transformation of the series by G
and the transformation of the sequence $s_k$ by F are equivalent if




(3.18)  $\lim_{k \to \infty} |g_{nk}|^{1/k}$ exists and is finite for every n.

In the present case $g_{nk}$, given by (2.4) and (3.5), satisfies condition (3.18),
the limit being $|\lambda|$ for every n.

Corollary. For any λ, F(λ) includes A(λ) on or outside the circle of
convergence in the domain D. This follows from the last part of the previous
proof, where it was not required that $|\lambda| < 1$.

We shall now investigate the domain of efficiency of the method A(λ)
for the series $1 + z + z^2 + \cdots$, and then extend the result to partial
star-domains.


3.XIII. A(λ) sums the series $1 + z + z^2 + \cdots$ if and only if z
satisfies simultaneously the conditions

(3.19)  $|z| < 1/|\lambda|$

and

(3.20)  $|1-\lambda| < |1/z - \lambda|$,

and the sum is the right value 1/(1 − z).

Proof. Applying the matrix A(λ) to the series, we obtain

(3.21)  $v_n(z) = \sum_{k=0}^{\infty} a_{nk} z^k = (1/\lambda - 1)^n \sum_{k=n}^{\infty} \binom{k}{n} (\lambda z)^k$,

and excluding the trivial cases λ = 0 and λ = 1, the series converges if and
only if (3.19) is satisfied, and then

(3.22)  $v_n(z) = (z - \lambda z)^n / (1 - \lambda z)^{n+1}$,

and the convergence of $\sum v_n(z)$ requires (3.20).
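
As a numerical illustration of 3.XIII, the sketch below sums the transformed terms (3.22) at a point outside the circle of convergence of the geometric series. The point z = −2 and the parameter λ = 0.3 are sample values chosen for the example, and the helper names `v` and `in_D` are assumptions, not notation from the paper.

```python
def v(n, z, lam):
    """Closed form (3.22) for the n-th transformed term of 1 + z + z^2 + ..."""
    return (z - lam * z) ** n / (1 - lam * z) ** (n + 1)

def in_D(z, lam):
    """Conditions (3.19) and (3.20); z is assumed non-zero here."""
    return abs(lam * z) < 1 and abs(1 - lam) < abs(1 / z - lam)

z, lam = -2.0, 0.3     # illustrative point with |z| > 1
total = sum(v(n, z, lam) for n in range(400))
expected = 1 / (1 - z)
```

At this point the ratio of successive terms is (z − λz)/(1 − λz) = −0.875, so a few hundred terms suffice; with λ = 0 (ordinary convergence) the same point fails (3.20).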


We shall denote by D(λ) the common part of the two circular domains
of which (3.19) represents the inside of a circle with centre at the origin
passing through the point 1/λ, while the boundary of (3.20) is a circle
passing through z = 1. The domain (3.20) is the inside or outside of the
circle if the real part of λ is less or greater than 1/2, and it is a half-plane
if $\Re\lambda = 1/2$. For real $\lambda \le 1/3$ (3.20) is inside (3.19), for $\lambda > 1/3$ the
circles intersect. D(λ) contains the origin for every λ.

The union of the domains D(λ) for all values of λ is the union of all
points to which the series can be continued, so that its boundary is the
envelope of the circles passing through z = 1 and the centres of which are




on the circle $|z| = 1$. The boundary, referred to polar coordinates with
pole at z = 1, is given by $\rho = 4 \sin^2(\theta/2)$.

The formulae (3.19), (3.20) can also be obtained by considering the
continuation which leads to the matrix A(λ) of (3.5). Writing $\beta = \lambda z'$,
β is inside the circle of convergence, which is (3.19). The radius of convergence
of the continued series is $|1 - \beta|$, and z' is in this circle if and
only if $|z' - \beta| < |1 - \beta|$, which is (3.20). Hence D(λ) is independent
of the nature but dependent on the location of the singularity.


(3.23) Definitions. A simply connected domain in the z-plane is said
to be a star-domain, if it contains the origin, and if every half-ray from the
origin meets its boundary in at most one point. D(λ) is a star-domain,
as can be verified from (3.19) and (3.20).

If f(z) is the function defined by $\sum c_k z^k$ and by its analytic continuations
along half-rays from the origin, then the first singularity ζ reached
on any half-ray is called a vertex of the principal star-domain. We denote
by ζD the set of all points ζz for which z belongs to the set D. If D is a
star-domain, having z = 1 as a boundary point, we shall call the common
points of all domains ζD, formed from all vertices ζ of the principal
star-domain, the partial star-domain of f(z) formed with D. If D is bounded,
and r' is the least, r'' the greatest distance of its boundary from the origin,
and if R is the radius of convergence of $\sum c_k z^k$, then we can modify the
definition of the partial star-domain in restricting the domains ζD to be
considered to all vertices ζ, for which $|\zeta| < R r''/r'$.


3.XIV. A(λ) sums the series $\sum c_k z^k$ in the partial star-domain formed
with D(λ), and the sum is f(z).

Proof. The summation represents the ordinary continuation. If φ(z)
has a sole singularity at z', φ(z'v) = ψ(v) has a sole singularity at v = 1,
hence A(λ) sums the Taylor series of φ(z) outside the region z'D'(λ), D'
being the complement of D with respect to the z-plane. Thus each singularity
ζ requires the exclusion of a region ζD'(λ). Since D(λ) is a
star-domain, the exclusion contains all regions belonging to singularities which
may be beyond the vertex. The union of all ζD'(λ) is the complement of
the domain common to all ζD(λ).

We shall now consider summation of the series $1 + z + z^2 + \cdots$ by two
consecutive series to series transformations which correspond to a two-fold
continuation. We first transform the series by A(λ) into a series $\sum v_j(z)$
which may converge or diverge. Then we sum this series by the matrix A(λ').
We obtain $v_j(z)$ as in (3.22) if $|z| < 1/|\lambda|$. The series $\sum v_j(z)$ is then




summable by A(λ') if (3.19) and (3.20) are satisfied, replacing z by
$(1-\lambda)/(1/z - \lambda)$ and λ by λ'. This gives the conditions:

(3.24)  (i) $|1/z| > |\lambda|$,  (ii) $|1/z - \lambda| > |\lambda'(1-\lambda)|$,
        (iii) $|1/z - \lambda - \lambda' + \lambda\lambda'| > |(1-\lambda)(1-\lambda')|$.

We denote by D[λ'(λ)] the common part of the three circles in (3.24),
representing the domain of efficiency for the series $\sum z^k$ of the transformations
by A(λ) followed by A(λ'). The order, in which the transformations
are carried out, is important: in general D[λ'(λ)] is different from D[λ(λ')],
and both may differ from D(λλ'), which is the domain of efficiency for the
single summation by the product A(λ)A(λ'). The following examples may
serve as illustration:


(3.25) A=—1/2, XY =—1/3. The repeated transformations both include 
D(X’), and neither includes the other. 


(3.26) A=—A’=——1/3. All three domains are equal. 


(3.27) λ = 1/7, λ' = 1/3, applied in this order reaches to the point
z = −7, the greatest distance obtainable by two consecutive continuations.


(3.28) A= (1+ 134) /3, = (4+ 71124)/7. (A)] contains the 
real axis between z—1 and z=—3/2, i.e. it extends partly outside the 


principal star-domain. 
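
Example (3.27) can be checked directly against the three conditions of (3.24). The sketch below assumes the reading λ = 1/7, λ' = 1/3 of that example and the form of (3.24) obtained by substituting into (3.19) and (3.20); the function name `in_repeated_domain` is illustrative only.

```python
def in_repeated_domain(z, lam, lam2):
    """Conditions (3.24) for A(lam) followed by A(lam2); z is assumed non-zero."""
    w = 1 / z
    first = abs(w) > abs(lam)
    second = abs(w - lam) > abs(lam2 * (1 - lam))
    third = abs(w - lam - lam2 + lam * lam2) > abs((1 - lam) * (1 - lam2))
    return first and second and third

lam, lam2 = 1 / 7, 1 / 3

just_inside = in_repeated_domain(-6.9, lam, lam2)    # slightly short of z = -7
just_outside = in_repeated_domain(-7.1, lam, lam2)   # slightly beyond z = -7
```

On the negative real axis all three conditions reduce to |z| < 7 for these parameters, which is consistent with the statement that the repeated transformation reaches to z = −7.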


We now turn to the discussion of cases when the matrix A(λ) has an
inverse with respect to a Taylor series.

(3.29) Definition. If the matrix A transforms a series $\sum u_k$ into a
series $\sum v_n$, and if a matrix B transforms the second series into the original
series, then B is said to be the inverse of A with respect to the series $\sum u_k$.

If A has a two-sided reciprocal, the reciprocal is the inverse, provided
that it applies to the transformed series. It will be convenient to use the
notation of (3.14), when the reciprocal of A(p) is A(1/p).

3.XV. A(1/p) is for $p \ne 0$ the inverse of A(p) with respect to
$1 + z + z^2 + \cdots$ at the inner points of the segment

(3.30)  (i) $|z| < 1/|1-p|$,  (ii) $|z| < |1/(1-p) - z|$.

Proof. This is a case of repeated transformations, but the last series
need not converge. Hence (i) and (ii) of (3.24) are sufficient with $\lambda = 1 - p$




and $\lambda' = 1 - 1/p$, and this gives (3.30). It can be verified that the second
transformation then yields the original series.

The domain defined by (3.30), which we denote by S(p), is the segment
of a circle with centre at the origin and passing through z = 1/(1 − p) cut
off by a line that bisects perpendicularly the line joining the origin to
1/(1 − p). For p < 1, S(p) contains the singular point z = 1, and for
suitable p may extend to any point of the z-plane. The union of domains
S(p) for all values of p is the whole z-plane.

3.XVI. A(1/p) is for $p \ne 0$ the inverse of A(p) with respect to the
series $\sum c_k z^k$ in the partial star-domain formed with S(p).

This follows from 3.XV in the same way as 3.XIV from 3.XIII. Although
the boundary of S(p) does not pass through z = 1, S(p) is a
star-domain, and the definition can be extended to this case.

Corollary 1. If R is the radius of convergence of the series $\sum c_k z^k$,
the partial star-domain contains the circle $|z| < R/2|1-p|$.

This follows from the remark at the end of (3.23), since for S(p)
$r' = 1/2|1-p|$. If $R \ne 0$, we can make the partial star-domain arbitrarily
large. For z = 1 we then obtain:

Corollary 2. If $\lim |u_k|^{1/k} = 1/R$ is finite, A(p) has an inverse
A(1/p) with respect to $\sum u_k$ for all p satisfying $|1-p| < R/2$.


3. The theorem holds for F(p) and sequence 


4. The transpose of the Taylor series continuation matrix. The transpose
A' of a matrix A is defined by $a'_{nk} = a_{kn}$. The transpose of A(p) is
the matrix

(4.1)  $a'_{nk} = \binom{n}{k} p^k (1-p)^{n-k}$

with the convention (3.3), a triangular matrix, which is the matrix E(p)
of the Euler sequence to sequence transformation (Agnew, formula 1.2).

4.I. The transpose E(p) of the matrix A(p), $p \ne 0$, is a T-matrix
if and only if A(p) is an α-matrix.

This follows from the well-known result that E(p) is regular if and only
if $0 < p \le 1$ (Agnew, page 314), and from 3.I. The transpose of the
matrix F(p) is the matrix

(4.2)  $f'_{nk} = \binom{n}{k} p^{k+1} (1-p)^{n-k}$,




a triangular matrix, which is the matrix $\mathcal{E}(p)$ of the Euler series to series
transformation (Agnew, formula 7.2).

4.II. The transpose $\mathcal{E}(p)$ of the matrix F(p) is an α-matrix if and
only if F(p) is a T-matrix.

Proof. Considering the series to sequence transformation matrix G
corresponding to $\mathcal{E}(p)$ given as in (2.4) we obtain

(4.3)  $\lim_{n \to \infty} g_{nk} = \sum_{j=k}^{\infty} \binom{j}{k} p^{k+1} (1-p)^{j-k}$,

and, as calculated in Agnew's paper (7.61 and 7.62),

(4.4)  $g_{nk} - g_{n,k+1} = \binom{n+1}{k+1} p^{k+1} (1-p)^{n-k}$,

so that (1.1) and (1.2) are satisfied if and only if $0 < p \le 1$, which is the
condition for F(p) to be a T-matrix.
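
The difference formula (4.4) can be confirmed in exact arithmetic from the elements (4.2). This is a minimal sketch; the helper names `euler_series` and `g` and the sample parameter are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb

def euler_series(n, k, p):
    """Element (4.2) of the transpose of F(p): C(n,k) p^(k+1) (1-p)^(n-k), k <= n."""
    if k > n:
        return Fraction(0)
    return comb(n, k) * p ** (k + 1) * (1 - p) ** (n - k)

def g(n, k, p):
    """Series to sequence element as in (2.4): sum of rows 0..n in column k."""
    return sum(euler_series(j, k, p) for j in range(n + 1))

p = Fraction(3, 5)   # arbitrary illustrative parameter

# (4.4): g_nk - g_{n,k+1} = C(n+1, k+1) p^(k+1) (1-p)^(n-k) for k <= n.
identity_ok = all(
    g(n, k, p) - g(n, k + 1, p)
    == comb(n + 1, k + 1) * p ** (k + 1) * (1 - p) ** (n - k)
    for n in range(7) for k in range(n + 1)
)
```

The right-hand side is the element of E(p) in row n + 1 and column k + 1, in line with the remark that follows about removing the first row and column of E(p).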

From (4.4) follows that $\mathcal{E}(p)$ is the matrix corresponding not to E(p)
but to a matrix $(a'_{n+1,k+1})$, obtained from E(p) by removing its first row
and first column. The two methods E(p) and $\mathcal{E}(p)$ are not equivalent, as
can be seen from the example:

Applying E(p) to the sequence 1, 1, ... we obtain $\sigma_n = 1$ for every n,
i.e. the E(p)-sum of the series $1 + 0 + 0 + \cdots$ is 1 for every p. But
the $\mathcal{E}(p)$-transform of this series converges only if $|1-p| < 1$. This
follows also from Theorem 7.8 of Agnew's paper, which gives a complete
discussion of the two methods.

There is one point we wish to pursue. Agnew (Theorem 8.2) proves
that the series $\sum c_k z^k$ is summable by E(p) in the partial star-domain formed
with D(p) (cf. (3.23) of this paper), where

(4.6)  D(p) is given by $|z - (1 - 1/p)| < |1/p|$ (Agnew 8.11),

provided that $|1-p| < 1$.

We shall investigate the case when $|1-p| > 1$, mainly for $\mathcal{E}(p)$-summability.
It has been shown by Agnew (pp. 327-328) that the series
$\sum z^k$ is summable $\mathcal{E}(p)$ for $p \ne 0$ in the circle (4.6). Our aim is to state
sufficient conditions under which the series $\sum c_k z^k$ is summable $\mathcal{E}(p)$ when
$|1-p| > 1$. In the case treated by Agnew the origin is inside the circle
(4.6), so that D(p) is a star-domain, while in the present case the origin
is outside.




4.III. If the function f(z), represented by the series $\sum_{0}^{\infty} c_k z^k$ and its
analytic continuations, has a single finite singularity at $z = \zeta \ne 0$, then it is
summable $\mathcal{E}(p)$ for $|1-p| > 1$ in the circle ζD(p), provided that f(1/z)
tends to zero uniformly as $z \to 0$.

Proof. If z is in ζD(p), then z/ζ is in D(p), hence (4.6) gives

(4.7)  $|1 - p + pz/\zeta| < 1$ for all z in ζD(p).

The functions of the complex variable w

(4.8)  $\phi(w) = f(w)/(w - z)$ and $\psi(w) = \sum_{n=0}^{\infty} p w^{-1} f(w) (1 - p + pz/w)^n$

are equal for all values of w satisfying

(4.9)  $|1 - p + pz/w| < 1$,

hence their residues are equal.

Denoting the residue of φ(w) at w = α by R(φ, α), (4.8) gives

(4.10)  $R(\phi, \zeta) + R(\phi, z) = R(\psi, \zeta) + R(\psi, 0)$.

Now, $R(\phi, z) = f(z)$, and at $w = \zeta$ (4.9) is satisfied, hence
$R(\phi, \zeta) = R(\psi, \zeta)$. Also substituting for f(w) the series $\sum c_k w^k$, and for
$(1 - p + pz/w)^n$ its binomial expansion, (4.10) becomes

(4.11)  $f(z) = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \binom{n}{k} p^{k+1} (1-p)^{n-k} c_k z^k$

whenever the series converges. Considering the functions of the complex
variable t

(4.12)  $V_n(t) = p t^{-1} f(1/t) (1 - p + pzt)^n$,  $n = 0, 1, \ldots$,

for $|t| = r > 1/|\zeta|$ we have $t^{-1} f(1/t) = \sum_k c_k t^{-k-1}$, hence integrating
term by term we obtain

(4.13)  $(1/2\pi i) \int_{|t|=r} V_n(t)\, dt = \sum_{k=0}^{n} \binom{n}{k} p^{k+1} (1-p)^{n-k} c_k z^k = v_n(z)$.

The only singularities of the integrand in and on the path of integration are
at t = 1/ζ and t = 0, hence as in (4.10)

(4.14)  $v_n(z) = R(V_n, 1/\zeta) + R(V_n, 0)$.

For a fixed z satisfying (4.7), and for $|1 - p + pz/\zeta| < \theta < 1$,

(4.15)  $|R(V_n, 1/\zeta)| \le K \theta^n$,



K being independent of n. Also, for a sufficiently small positive ρ and $|t| \le \rho$,
$|1 - p + pzt|^n \le K'$, and since $f(1/t) \to 0$ uniformly as $t \to 0$,

(4.16)  $R(V_n, 0) = 0$.

From (4.14), (4.15) and (4.16) follows the convergence of $\sum v_n(z)$,
and (4.11) is established.

Corollary 1. If f(z) has several singularities $\zeta_j$ $(\zeta_j \ne 0)$, it is summable
$\mathcal{E}(p)$ in the domain common to all $\zeta_j D(p)$.

Corollary 2. Under the conditions of the theorem or of Corollary 1
the series is summable E(p). This follows from Agnew, Theorem 7.8, which
states that E(p) includes $\mathcal{E}(p)$.

Examples.

(4.17) If $f(z) = (1-z)^{-2}$ given by $1 + 2z + 3z^2 + \cdots$, we have
$v_n(z) = p(1 - p + pz)^n + np^2 z (1 - p + pz)^{n-1}$, hence in D(p) $(p \ne 0)$
$\sum v_n(z) = (1-z)^{-1} + z(1-z)^{-2} = (1-z)^{-2}$. Here $f(1/z) \to 0$ uniformly.

(4.18) If $f(z) = z^2 (1-z)^{-2}$ given by the series $0 + 0 + z^2 + 2z^3 + \cdots$,
$v_n(z) = np^2 z (1 - p + pz)^{n-1} - p(1 - p + pz)^n + p(1-p)^n$, and
$\sum v_n(z)$ converges in D(p) if and only if $|1-p| < 1$. Here $f(1/z) \to 1$.

(4.19) If $f(z) = -z^{-1} \log(1-z)$ given by $1 + z/2 + z^2/3 + \cdots$,
$v_n(z) = [(1 - p + pz)^{n+1} - (1-p)^{n+1}]/(n+1)z$, and $\sum v_n(z)$ converges
in D(p) if and only if $|1-p| < 1$. Here $f(1/z) \to 0$ as $z \to 0$, but not
uniformly, considering all branches of the multivalued function.


5. Continuation of the Laurent series as a matrix transformation.²
In this section we first consider the series to series transformation which
represents the continuation of the Laurent series by a Taylor series. We
obtain families of matrices similar to those in section 3 with corresponding
expressions for sequence to sequence transformation matrices.

(5.1) Let $f(z) = \sum_{k=0}^{\infty} c_k (\beta - z)^{-k}$ represent a function regular at all
finite points of the z-plane except at $z = \beta \ne 0$. Expressed as a Taylor series
$\sum C_n z^n$ in the circle $|z| < |\beta|$, the coefficients $C_n$, obtained from the expansions
of $(\beta - z)^{-k}$, are $C_n = \sum_{k=0}^{\infty} \binom{k+n-1}{n} c_k \beta^{-k-n}$. Thus in the circle $|z| < |\beta|$

  $f(z) = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} \binom{k+n-1}{n} c_k \beta^{-k-n} z^n$,


² I wish to thank Mr. W. Weinstein, who gave me the idea of investigating the
continuation of Laurent series.




and writing $1 - z/\beta = t$,

  $f(z) = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} \binom{k+n-1}{n} t^k (1-t)^n c_k (\beta - z)^{-k}$.

Thus the continuation of the series (5.1) can be regarded as the generalized
sum by series to series transformation by the matrix A(t), where

(5.3)  $a_{nk}(t) = \binom{k+n-1}{n} t^k (1-t)^n$,  $n, k = 0, 1, 2, \ldots$


The matrix A(t) is independent of the particular values of z and β, and of
the coefficients $c_k$. The parameter t for all complex values of t defines a
family of matrices.

5.I. A(t) is an α-matrix if and only if $0 < t \le 1$.

Proof. Obviously A(0) is not an α-matrix, A(1) is an α-matrix. For
other values of t the corresponding series to sequence transformation matrix
G, given as in (2.4) is $g_{nk} = \sum_{j=0}^{n} \binom{k+j-1}{j} t^k (1-t)^j$ and a short
calculation gives

(5.4)  $g_{nk} - g_{n,k+1} = \binom{k+n}{n} t^k (1-t)^{n+1}$,

so that $S_n = \sum_k |g_{nk} - g_{n,k+1}|$ converges if and only if $|t| < 1$, and then
$S_n = [|1-t|/(1-|t|)]^{n+1}$. This is bounded if and only if $|1-t| \le 1 - |t|$.
The two conditions together require $0 < t < 1$, and then
$g_{nk} \to 1$ as $n \to \infty$ for every k. We also have:


5.II. A(t), $t \ne 0$, is an α-matrix if and only if it is non-negative.

5.III. A(t), $t \ne 0$, is translative to the right.

Proof. With the notation of (3.11) for $n \ge 0$ we obtain

(5.5)  $v'_{n+1} - (1-t) v'_n = t v_{n+1}$ and $v'_0 = t v_0$,

(5.6)  $\sigma'_{n+1} - (1-t) \sigma'_n = t \sigma_{n+1}$,  $(n = 0, 1, \ldots)$,

so that $\lim \sigma'_n = s$ implies $\lim \sigma_n = s$.

5.IV. A(t) is translative to the left for $|1-t| < 1$.

Proof. From (5.6) follows




(5.7)  $\sigma'_n = t(1-t)^n \sigma_0 + t(1-t)^{n-1} \sigma_1 + \cdots + t \sigma_n$

and the result follows as in 3.IV.

Formula (5.4) shows that the sequence to sequence summation matrix
corresponding to A(t) is the matrix F(t), where

(5.8)  $f_{nk}(t) = \binom{k+n}{n} t^k (1-t)^{n+1} = (1/t - 1)\, a_{n,k+1}(t)$.

5.V. F(t) is a T-matrix if and only if $0 < t < 1$.

This follows from (5.4) and (5.5), and F(1) is the zero matrix.
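
The identity (5.4) and the relation between F(t) and A(t) in (5.8) can be checked exactly from the elements (5.3). The sketch below is an illustration only; it assumes the usual convention that the binomial coefficient in (5.3) equals 1 for n = 0, and the helper names are not from the paper.

```python
from fractions import Fraction
from math import comb

def a(n, k, t):
    """Element (5.3): C(k+n-1, n) t^k (1-t)^n, with the coefficient taken as 1 for n = 0."""
    coeff = 1 if n == 0 else comb(k + n - 1, n)
    return coeff * t ** k * (1 - t) ** n

def g(n, k, t):
    """Series to sequence element as in (2.4): a(0,k) + ... + a(n,k)."""
    return sum(a(j, k, t) for j in range(n + 1))

def f(n, k, t):
    """Right-hand side of (5.4): C(k+n, n) t^k (1-t)^(n+1)."""
    return comb(k + n, n) * t ** k * (1 - t) ** (n + 1)

t = Fraction(2, 5)   # arbitrary illustrative parameter, t != 0

difference_ok = all(g(n, k, t) - g(n, k + 1, t) == f(n, k, t)
                    for n in range(6) for k in range(6))
shift_ok = all(f(n, k, t) == (1 / t - 1) * a(n, k + 1, t)
               for n in range(6) for k in range(6))
```

The second check expresses the fact that the sequence to sequence matrix is obtained from A(t) by a shift of one column and the factor 1/t − 1.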

5.VI. Excluding t = 0 and t = 1, F(t) is translative to the right
(left) if and only if A(t) is the same.


The proof follows from (5.8) applying (5.5) to the sequences 
0, 0, So, and 0, - respectively, where o, now denotes the F(t) 
transform of the sequence s;. The result then follows as in 5. III and 
5. IV. 


5.VII. A(t) includes F(t) for $t \ne 1$.

For t = 0 the result is trivial. When $t \ne 0$ and the sequence $s_k$ is
summable F(t), it follows from 5.III and 5.VI that $s_{k+1}$ is summable F(t),
so that, by (5.8),

  $\sum_k f_{nk}(t)\, s_{k+1} = (1/t - 1) \sum_k a_{n,k+1}(t)\, s_{k+1}$

is convergent for all n.
The result then follows as in 3.X. We also have as in 3.XI:

5.VIII. A necessary condition that F(t) should include A(t) is that
$|t| < 1$.

5.IX. If the Taylor series $\sum c_k z^k$ has non-zero radius of convergence,
and if for $|t| < 1$ A(t) sums the series in a domain D, then F(t) sums
the series to the same sum in D.

Corollary. For any t, F(t) includes A(t) on or outside the circle of
convergence in D. The proof is the same as of 3.XII.


5.X. A(t) sums the series $1 + z + z^2 + \cdots$ for every t at z = 0,
and for $t \ne 0$ if and only if z satisfies simultaneously

(5.9)  $|z| < 1/|t|$




and

(5.10)  $|1 - t| < |1 - tz|$,

and the sum is the right value $(1-z)^{-1}$.

Proof. Applying A(t) to the series we obtain $v_0(z) = (1 - tz)^{-1}$, and
for n > 0, $v_n(z) = (1-t)^n tz (1 - tz)^{-n-1}$ provided that z = 0 or $|tz| < 1$.
Then $\sum v_n(z) = (1-z)^{-1}$ if and only if z = 0 or $|(1-t)/(1-tz)| < 1$.
This concludes the proof.
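
A quick numerical check of the proof of 5.X: the sketch below sums the transformed terms at a sample point outside the unit circle. The values z = −1.5 and t = 0.5 are illustrative choices, and `v` is a helper name introduced only for this example.

```python
def v(n, z, t):
    """Transformed terms of 1 + z + z^2 + ... under A(t), as in the proof of 5.X."""
    if n == 0:
        return 1 / (1 - t * z)
    return (1 - t) ** n * t * z / (1 - t * z) ** (n + 1)

z, t = -1.5, 0.5       # illustrative point with |z| > 1
total = sum(v(n, z, t) for n in range(200))
expected = 1 / (1 - z)

# The two conditions stated in the proof for this point.
inner_ok = abs(t * z) < 1
outer_ok = abs((1 - t) / (1 - t * z)) < 1
```

Here the ratio (1 − t)/(1 − tz) is 2/7, so the transformed series converges rapidly although the original series diverges at z = −1.5.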

Thus the method is efficient at z = 0 and in a crescent inside the circle
(5.9) and outside the circle (5.10). The circle (5.9) has its centre at
the origin and passes through z = 1/t; the circle (5.10) has its centre at
z = 1/t and passes through z = 1. Denoting the domain (5.9), (5.10) by
D(t), the origin belongs to D(t) if the real part of 1/t is greater than 1/2,
i.e. if |1 - t| < 1. The circle (5.9) is inside (5.10), hence D(t) is empty,
if |1 - t| > 2. In this case A(t) is efficient at the single point z = 0 only.
If 1 < |1 - t| < 2, A(t) is efficient in D(t) and at the isolated point z = 0.
If -1 < t < 0, D(t) has no point common with the circle of convergence.
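
The statement of 5.X can be checked numerically. The following is a minimal sketch, not part of the paper: it assumes the closed forms $v_0(z) = (1-tz)^{-1}$ and $v_n(z) = (1-t)^n tz (1-tz)^{-n-1}$ from the proof of 5.X and the conditions (5.9), (5.10); the function names `in_D` and `A_partial_sum` are hypothetical.

```python
def in_D(t, z):
    """True when z satisfies both (5.9) and (5.10) for a parameter t != 0."""
    return abs(z) < abs(1 / t) and abs(z - 1 / t) > abs(1 - 1 / t)


def A_partial_sum(t, z, N):
    """v_0 + v_1 + ... + v_N for the geometric series 1 + z + z^2 + ...,
    using the closed forms quoted from the proof of 5.X (valid for |tz| < 1)."""
    w = t * z
    total = 1 / (1 - w)                                   # v_0(z)
    for n in range(1, N + 1):
        total += (1 - t) ** n * w / (1 - w) ** (n + 1)    # v_n(z), n > 0
    return total


# A point outside the circle of convergence |z| < 1 but inside D(1/2):
print(in_D(0.5, -1.5), A_partial_sum(0.5, -1.5, 60), 1 / (1 - (-1.5)))
```

With t = 1/2 and z = -3/2 the ratio $|(1-t)/(1-tz)|$ is 2/7, so a few dozen terms already reproduce $(1-z)^{-1}$ closely, although the power series itself diverges there.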

Of particular interest are the methods A(t) for which the origin is in
D(t), i.e. for which |1 - t| < 1. The domain D(t) is not a star-domain
as defined in (3.23), since the half-ray from the origin to an intersection
of the two circles meets the circle (5.10) again inside the circle (5.9). For
a suitably small |t|, varying the centre of (5.10) so that |1 - t| < 1,
D(t) can be made to include any point but z = 1. Hence the union of
domains D(t) with |1 - t| < 1 is the whole z-plane with z = 1 and z = ∞
excluded.

The α-matrices A(t), i.e. those for which 0 < t ≤ 1, form a subclass of the above
class. The union of domains D(t) of this subclass has as boundary the locus
of intersections of the circles (5.9) and (5.10). In polar coordinates
with respect to z = 1 this boundary is $\rho = (8\cos^3\theta - 4\cos\theta)/(1 - 4\cos^2\theta)$,
with asymptotes $\theta = \pm\pi/3$, the two branches meeting at
z = 1, where the tangents make angles $\pm\pi/4$ with the real axis. Thus
the union of regular A(t) matrices is efficient in a domain which extends
beyond the Borel half-plane.


(5.11) Definitions. Let the series $\sum c_k z^k$ have a finite non-zero radius
of convergence. Let L be an open Jordan curve joining z = 0 and z = 1.
We denote by ζL the curve formed from the points ζz′, when z′ are all points
of L. If the series can be continued along ζL but not at ζ, we say that ζ is a
vertex of the curvilinear star-domain S(L), and the function f(z), defined
by analytic continuations along all curves zL for which z is not a vertex, is
said to be defined in the curvilinear star-domain S(L).

The domain D(t) for |t| < 1 and |1 - t| < 1 can be divided into
two symmetrical halves each of which is a curvilinear star-domain
corresponding to symmetrical circular arcs L and L′. The partial curvilinear
star-domain of f(z) formed with D(t) is the common part of all domains
ζD(t) for which ζ is a vertex of either of the curvilinear star-domains S(L)
or S(L′) and for which ζ is in the circle |ζ| < Rr′/r′ (cf. (3.23)).


5.XI. If |t| < 1 and |1 - t| < 1, A(t) sums the series $\sum c_k z^k$ in
the partial curvilinear star-domain formed with D(t), and the sum is the
value of f(z) obtained by continuation in this domain.


Proof. For a fixed t and a regular point z of f(z) we form the series
of functions of the complex variable v

(5.12) $\sum_{n=0}^{\infty} \phi_n(v) = tz\, v^{-1} (v - tz)^{-1} f(v) \sum_{n=0}^{\infty} \left[(1-t)/(1 - tz/v)\right]^{n},$

where $\phi_n(v)$ can be represented by the series

(5.13) $\phi_n(v) = tz (1-t)^{n} v^{-2} f(v) \sum_{k=0}^{\infty} \binom{n+k}{n} (tz/v)^{k}.$

Both series converge in the domain given by

(5.14) (i) $|1 - tz/v| > |1 - t|$, (ii) $|v| > |tz|$,

and which we shall denote by V(t).

Here (i) represents the outside of a circle surrounding v = tz, the
origin being outside and v = z on this circle, while (ii) represents the
outside of a circle with centre at the origin and passing through v = tz.


(5.15) We suppose that all singularities of f(v) are in V(t). 


Constructing three closed Jordan curves $C_j$ (j = 1, 2, 3) from arcs
concentric with the boundary of V(t) such that $C_j$ is inside $C_{j+1}$ and all
singularities of f(v) are outside $C_3$, we have in the ring between $C_1$ and $C_3$

(5.16) $\sum_{n=0}^{\infty} \phi_n(v) = z f(v)/v(v - z),$

and

(5.17) $(1/2\pi i) \int_{C_2} \{\sum_{n=0}^{\infty} \phi_n(v)\}\, dv = f(z) - f(0).$



Substituting for $\phi_n(v)$ the series (5.13), and integrating term by term
the double series, which is uniformly convergent in the ring between $C_1$ and
$C_3$, we obtain

co k 


or rewritten 


(5. 19) ("+ ‘) (1 = f(z). 


n=0 k=-0 


This was obtained under the supposition (5.15), hence if v = ζ is a
singularity of f(v), (5.14) is satisfied for v = ζ, i.e.

(5.20) (i) $|z/\zeta - 1/t| > |1 - 1/t|$, (ii) $|z/\zeta| < |1/t|$.

Comparison with (5.9) and (5.10) shows that (5.20) requires z/ζ to be in
D(t), i.e. z is in ζD(t). This proves the theorem.


COROLLARY. Under the conditions of the theorem F(t) sums the series
to the same value. This follows from 5.IX.


(5.21) Example. The series $0 + z/1 + z^2/2 + \cdots$ represents the
function $f(z) = -\log(1 - z)$. Applying A(t) with |1 - t| < 1, we obtain
for z in D(t)

(5.22) $\sum v_n(z) = -\log[1 - (1-t)/(1-tz)] + \log t - \log(1 - tz),$

where $\log(1 + u)$ means the value of the function defined by the convergent
series $u - u^2/2 + \cdots$. Hence the sum is $-\log(1 - z) + 2m\pi i$ (m = 0,
1 or -1), m depending on t and z.

If $t = (1 - 2i)/5$, D(t) contains the points $1 < z < 5^{1/2}$, so that z = 2 is in
D(t), and for this value of z, $\sum v_n(z) = -\pi i$, which is the value of f(z)
when the series is continued in D(t). If t is the conjugate of the previous
value, the generalized sum at z = 2 is $+\pi i$, corresponding to the continuation
in D(t). If z = -2, which is in D(t) for both values of t, we obtain in
both cases $\sum v_n(z) = -\log 3$.
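
The values quoted in Example (5.21) can be reproduced from the closed form (5.22). The sketch below is an illustration only: it assumes that each logarithm in (5.22) may be taken as the principal branch (true here because every argument has positive real part, so the principal value agrees with the series $u - u^2/2 + \cdots$), and the name `A_sum_log` is hypothetical.

```python
import cmath
import math


def A_sum_log(t, z):
    """Right-hand side of (5.22) for f(z) = -log(1 - z).

    cmath.log returns the principal branch, which coincides with the
    series-defined logarithm when |q| < 1, |1 - t| < 1 and |t z| < 1.
    """
    q = (1 - t) / (1 - t * z)
    return -cmath.log(1 - q) + cmath.log(t) - cmath.log(1 - t * z)


t = (1 - 2j) / 5
print(A_sum_log(t, 2))               # close to -pi*i
print(A_sum_log(t.conjugate(), 2))   # close to +pi*i
print(A_sum_log(t, -2), -math.log(3))
```

The two conjugate parameters give the two determinations of $-\log(1-z)$ at z = 2, one for each way of passing round the singularity z = 1, while at z = -2 both agree with $-\log 3$.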


To conclude, we show the connection between the matrices A(A) of 
k 
(3.5) and A(t) of (5.3). Extending the definition of (‘) to negative 
integral 


( 0 )-1 




and substituting 1/¢ for A, (3.5) gives 


Gn -%(A) = ee, (1 —1/t)" (1 — t) — (t). 


I wish to express my thanks to Professor P. Dienes and Dr. R. G. Cooke 
for their kindness in reading the manuscript and making some useful com- 
ments on it. I am also indebted to Birkbeck College for the award of the 
Armitage-Smith Memorial Prize for 1945/46, during the tenure of which 
the main part of this work was done. 


BIRKBECK COLLEGE, 
UNIVERSITY OF LONDON, ENGLAND. 


REFERENCES. 


R. P. Agnew, "Euler transformations," American Journal of Mathematics, vol. 66
(1944), pp. 313-338.
R. G. Cooke, 1. Lectures at Birkbeck College, University of London.
2. Infinite Matrices and Sequence Spaces, in the press: Macmillan & Co., Ltd.
P. Dienes, The Taylor Series, Oxford, 1931.
J. D. Hill, "Some properties of summability," Duke Mathematical Journal, vol. 9
(1942), pp. 373-381.
O. Perron, "Zur Theorie der divergenten Reihen," Mathematische Zeitschrift, vol. 6
(1920), pp. 158-160 and 286-310.
P. Vermes, 1. "Product of a T-matrix and a γ-matrix," Journal of the London
Mathematical Society, vol. 21 (1946), pp. 129-134.
2. "On γ-matrices and their application to the binomial series," Proceedings
of the Edinburgh Mathematical Society, ser. 2, vol. 8 (1947), pp. 1-13.
3. "The application of γ-matrices to Taylor series," the same Proceedings,
ser. 2, vol. 8 (1948), pp. 43-49.
4. "Gamma matrices and their application to infinite series," Thesis, accepted
for the award of the Ph. D. degree of the University of London (1947).




A CLASS OF INTEGRO-DIFFERENTIAL EQUATIONS.* ¹


By SHIH-HSUN CHANG.


1. Introduction. In the present paper, I shall consider equations of
the form:

(A) $\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy$

and

(B) $\partial\phi(x,t)/\partial t + p(t)\,\phi(x,t) = q(x,t) + \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy.$

It will be assumed that the kernel K(x, y) is such that

$\int_a^b K^2(x,y)\, dy$

is a bounded function of x in a ≤ x ≤ b, that f(x, t, u) is a continuous
function of (x, t, u) for a ≤ x ≤ b, t and u being allowed to take all real
values, and that p(t) and q(x, t) are continuous functions of t and of (x, t)
respectively. The unknown function to be determined is $\phi(x,t)$.

The equation

($A_0$) $\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\,\phi(y,t)\, dy,$

which is a particular case of (A), was considered by Volterra ² in connection
with the solution of linear functional derivative equations. He showed that
the unique solution of ($A_0$) which reduces to a given continuous function
$\phi_0(x)$ when $t = t_0$ is given by

$\phi(x,t) = \phi_0(x) + \int_a^b L(x,y;t)\,\phi_0(y)\, dy,$

where

$L(x,y;t) = \sum_{n=1}^{\infty} ((t - t_0)^n/n!)\, K_n(x,y),$

the functions $K_n(x,y)$ being the iterated kernels of K(x,y). It is sometimes
convenient to be able to express the solutions of such equations in terms of

* Received December 30, 1947. 
1 My warmest thanks are due to Dr. F. Smithies, whose encouragement and advice 
have been of the utmost value to me in the course of this work. 
2V. Volterra, (1) p. 394. 


the characteristic functions and characteristic values of K(x,y). Barnett ³
has obtained such a result for the special equation ($A_0$) when K(x,y) is
symmetric or skew-symmetric.
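
The series solution of ($A_0$) attributed to Volterra in the Introduction can be illustrated numerically. The following is a sketch under stated assumptions, not the author's method: the integral is replaced by a midpoint quadrature, $t_0 = 0$, and the test kernel K(x,y) = xy on [0, 1] with $\phi_0 = 1$ is a hypothetical choice for which the iterated kernels are $K_n = 3^{-(n-1)} xy$ and the solution is $1 + (3/2)\,x\,(e^{t/3} - 1)$.

```python
import math


def volterra_series(kernel, phi0, a, b, t, m=200, terms=25):
    """Approximate phi(x,t) = phi0 + sum_{n>=1} t^n/n! * (K_n phi0)(x), t0 = 0.

    The integral over [a, b] is discretised by the midpoint rule with m nodes;
    applying the kernel n times to phi0 plays the role of the iterated kernel K_n.
    """
    h = (b - a) / m
    xs = [a + (i + 0.5) * h for i in range(m)]
    K = [[kernel(x, y) for y in xs] for x in xs]
    u = [phi0(x) for x in xs]          # u holds K^n phi0 on the grid
    phi = u[:]
    coeff = 1.0                        # t^n / n!
    for n in range(1, terms + 1):
        u = [h * sum(K[i][j] * u[j] for j in range(m)) for i in range(m)]
        coeff *= t / n
        phi = [p + coeff * ui for p, ui in zip(phi, u)]
    return xs, phi


xs, phi = volterra_series(lambda x, y: x * y, lambda x: 1.0, 0.0, 1.0, t=1.0)
exact = [1 + 1.5 * x * (math.exp(1 / 3) - 1) for x in xs]
print(max(abs(p - e) for p, e in zip(phi, exact)))   # small quadrature error
```

For a kernel of finite rank such as this one the series is an ordinary matrix exponential in disguise, which is why so few terms suffice.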

In the present paper I shall express the solutions of (A) and (B), for a
general kernel K(x,y), in terms of singular functions and singular values
of the kernel; the results will include in particular a solution of ($A_0$) for
a general kernel similar to Barnett's solution for symmetric and
skew-symmetric kernels.

We shall require some results about the theory of infinite systems of
non-linear differential equations which, in order to save space, we assume the
reader can easily find in Wintner's papers.⁴


2. Solution of Equation (A). We now discuss the solution of
equations of the form (A). We shall suppose that K(x,y) is a real L² kernel
defined for a ≤ x ≤ b, a ≤ y ≤ b, such that

(1) $\int_a^b K^2(x,y)\, dy$

is a bounded function of x, that $\{\lambda_n\}$ is its system of singular values, and
$\{\phi_n(x), \psi_n(y)\}$ is its complete orthonormal system of adjoint pairs of singular
functions, so that

(2) $\phi_n(x) = \lambda_n \int_a^b K(x,y)\,\psi_n(y)\, dy,$ $\psi_n(x) = \lambda_n \int_a^b K(y,x)\,\phi_n(y)\, dy.$

We write $K(x,y) = K[\phi_n(x), \psi_n(y), \lambda_n]$.


THEOREM 1. Let K(x, y) satisfy the above conditions and let f(x, t, u)
be a continuous function of (x, t, u) for a ≤ x ≤ b and all real values of t
and u such that:

(3) $\Phi_i(z_0, z_1, z_2, \cdots) = (1/\lambda_i) \int_a^b f(y, t, \sum_{j=1}^{\infty} z_j \phi_j(y))\,\psi_i(y)\, dy$

is for all values of i a regular power series ⁵ in the variables $z_0 = t, z_1, z_2, \cdots$,
in the domain $\sigma_r$ defined by

(4) $\sum_{i=0}^{\infty} |z_i|^2 < r^2.$


³ I. A. Barnett, (2).
⁴ A. Wintner, (3), (4), (5), (6).
⁵ A. Wintner, (6), p. 242.




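Suppose also that for all

$z = (z_0, z_1, z_2, \cdots)$

in $\sigma_r$ we have

(5) $\int_a^b [f(y, t, \sum_{i=1}^{m} z_i \phi_i(y))]^2\, dy \le A^2$

for some constant A independent of m and t. Then the integro-differential
equation:

(A) $\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy$

possesses a solution $\phi(x,t)$ satisfying the initial condition:

(6) $\phi(x, 0) = 0,$

and given by the formula

(7) $\phi(x,t) = \sum_{i=1}^{\infty} \phi_i(x)\, z_i(t),$

where $z_i(t)$ satisfies the system of equations:

(8) $dz_0/dt = \Phi_0 = 1,$ $dz_i/dt = \Phi_i(z_0, z_1, z_2, \cdots)$ (i = 1, 2, ···), $z_i(0) = 0$ (i = 0, 1, 2, ···).

If, in addition, we have

(9) $|f(x, t, \bar{u}) - f(x, t, u)| \le C\, |\bar{u} - u|$

for some constant C, the solution of (A) so obtained is unique.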
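By Bessel's inequality applied to (3), we have for all z in $\sigma_r$

$\sum_{i=1}^{\infty} \lambda_i^2 |\Phi_i|^2 \le \int_a^b [f(y, t, \sum_{j=1}^{\infty} z_j \phi_j(y))]^2\, dy \le A^2.$

We suppose the singular values $\{\lambda_n\}$, which are all real, arranged so that
$|\lambda_1| \le |\lambda_2| \le \cdots$; then $|\lambda_h| \to \infty$ as $h \to \infty$. Hence the sequence $\{\Phi_i\}$
is bounded in $\sigma_r$; i.e.,

(10) $\sum_{i=1}^{\infty} |\Phi_i|^2 \le B^2$, for $\sum_{i=0}^{\infty} |z_i|^2 < r^2$.

By Wintner's existence theorem,⁶ the system of equations (8) possesses a
solution such that $\sum_{i=0}^{\infty} |z_i|^2 < +\infty$ provided that

$|t| < r/2[\Phi]_r,$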


⁶ A. Wintner, (6), p. 251.

where $[\Phi]_r$ is the least value of B satisfying (10). We therefore have

(11) $dz_i/dt = (1/\lambda_i) \int_a^b f(y, t, \sum_{j=1}^{\infty} z_j(t)\,\phi_j(y))\,\psi_i(y)\, dy,$ $z_i(0) = 0,$

for i = 1, 2, ···.
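We now distinguish two cases. Suppose first that K(x,y) has only a
finite number of singular values, so that we may write:

$K(x,y) = \sum_{i=1}^{m} \phi_i(x)\,\psi_i(y)/\lambda_i.$

We construct the functions

$\phi_m(x,t) = \sum_{i=1}^{m} \phi_i(x)\, z_i(t).$

We then have, by (11),

$\partial\phi_m(x,t)/\partial t = \sum_{i=1}^{m} (dz_i/dt)\,\phi_i(x) = \sum_{i=1}^{m} (\phi_i(x)/\lambda_i) \int_a^b f(y, t, \phi_m(y,t))\,\psi_i(y)\, dy = \int_a^b K(x,y)\, f(y, t, \phi_m(y,t))\, dy.$

Thus $\phi_m(x,t)$ satisfies the integro-differential equation (A) together with
the initial condition (6).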
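Now suppose that K(x,y) has an infinity of singular values. We shall
prove that

(12) $\phi(x,t) = \lim_{m\to\infty} \phi_m(x,t)$

exists, and that

(13) $\partial\phi(x,t)/\partial t = \sum_{i=1}^{\infty} (dz_i/dt)\,\phi_i(x) = \lim_{m\to\infty} \partial\phi_m(x,t)/\partial t.$

The results will then follow.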
Applying Bessel's inequality to (11), we obtain

$\sum_{i=1}^{\infty} \lambda_i^2 (dz_i/dt)^2 \le \int_a^b [f(y, t, \sum_{j=1}^{\infty} z_j(t)\,\phi_j(y))]^2\, dy \le A^2.$

Hence, by Cauchy's inequality, we have:

$\sum_{i=1}^{\infty} |(dz_i/dt)\,\phi_i(x)| \le A\,[\sum_{i=1}^{\infty} \phi_i^2(x)/\lambda_i^2]^{1/2} \le A\,[K_2(x,x)]^{1/2} \le AM,$

where

$K_2(x,y) = \int_a^b K(x,s)\, K(y,s)\, ds$

and $M^2$ is the upper bound of $K_2(x,x)$ in a ≤ x ≤ b. The series
$\sum_{i=1}^{\infty} (dz_i/dt)\,\phi_i(x)$
is therefore boundedly convergent, and may therefore be integrated term by
term with respect to t. The existence of $\phi(x,t)$ and the equations (12)
and (13) are therefore proved.
By a well known expansion theorem,’ we have 


db 


t=m+1 


Hence, by Cauchy’s and Bessel’s inequalities, 


| xm(2, t) | ‘= t, dm(Y; t) ) 


i=m+1 


<= A? 0 as m> 


i=m+1 
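But $\partial\phi_m(x,t)/\partial t \to \partial\phi(x,t)/\partial t$ as $m \to \infty$, and, by dominated convergence
and the continuity of f(y, t, u),

$\int_a^b K(x,y)\, f(y, t, \phi_m(y,t))\, dy \to \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy$

as $m \to \infty$. Hence

$\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy.$

Thus $\phi(x,t)$ is a solution of equation (A), and it obviously satisfies the
initial condition (6). This completes the existence proof.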

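We now come to the uniqueness part of the theorem. Suppose that the
equation (A) has two solutions $\phi(x,t)$ and $\bar\phi(x,t)$ satisfying the initial
condition (6); we then have

$\partial\phi(x,t)/\partial t - \partial\bar\phi(x,t)/\partial t = \int_a^b K(x,y)\,[f(y, t, \phi(y,t)) - f(y, t, \bar\phi(y,t))]\, dy,$

so that

$\phi(x,t) - \bar\phi(x,t) = \int_0^t du \int_a^b K(x,y)\,[f(y, u, \phi(y,u)) - f(y, u, \bar\phi(y,u))]\, dy.$

Hence, by (9),
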

7F. Smithies (7), or E. Schmidt (8). 


(14) $|\phi(x,t) - \bar\phi(x,t)| \le |\int_0^t du \int_a^b |K(x,y)|\, C\, |\phi(y,u) - \bar\phi(y,u)|\, dy| \le CGN\,|t|,$

where G is the upper bound of $|\phi(y,u) - \bar\phi(y,u)|$ for a ≤ y ≤ b, and a
fixed interval of values of u containing the interval (0, t), and N is the
upper bound of

$\int_a^b |K(x,y)|\, dy$

for a ≤ x ≤ b. Repeating the argument, we obtain from (14):

$|\phi(x,t) - \bar\phi(x,t)| \le C^2 G N^2\, |t|^2/2!,$

and we eventually obtain, by induction,

$|\phi(x,t) - \bar\phi(x,t)| \le (C^n G N^n\, |t|^n/n!) \to 0$ (n → ∞).

Hence $\phi(x,t)$ and $\bar\phi(x,t)$ are identically equal.
COROLLARY 1. Let $\eta = (\eta_1, \eta_2, \cdots)$ be a point of the domain:
, 
>| m |? < 
and let 
(15) $o() = (2), 


the series being convergent in mean square. If the power -series 
(t, 0. + my + m2, ° (1 = 1, 2, 3,- 
are regular in the domain: 
then the integro-differential equation (A) has a solution $\phi(x,t)$ satisfying
the initial condition $\phi(x, 0) = \phi_0(x)$.
We have, by Minkowski’s inequality, 
2 
so that 
b m 
i= 


the vector + m1, is therefore bounded, and the system 
of equations 




dé;/dt = ©; (t, +m, + m,° 
b oo 
= (1/2) +m) ay) 


has a solution ζ satisfying the initial conditions ζ(0) = 0. This gives the
required result.
Theorems similar to Theorem 1 can be established for the adjoint integro- 


differential equation 
b 


and for systems of integro-differential equations of the form 


b 
(An) Opi (x, t) /dt = f KO (x,y) t, t) )dy 


by exactly similar methods. 


3. Special cases. The simplest special case is that in which f(x, t, u)
= u, so that we have to solve the equation

($A_0$) $\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\,\phi(y,t)\, dy$

with the initial condition

(16) $\phi(x, 0) = \phi_0(x).$

Barnett ⁸ showed that if K(x,y) is symmetric, the solution of the problem


is given by 


where $\{\lambda_i\}$ is the set of characteristic values of K(x,y), and $\{\phi_i(x)\}$ is the
corresponding complete orthonormal system of characteristic functions. We
shall call equation (17) Barnett's formula.

Now suppose that K(x,y) is a general kernel, whose set of singular
values is $\{\lambda_n\}$ and whose complete orthonormal system of pairs of adjoint
singular functions is $\{\phi_n(x), \psi_n(y)\}$. The system of differential equations
(8) then reduces to

(18) $dz_i/dt = \sum_{j=1}^{\infty} a_{ij} z_j$ (i = 1, 2, 3, ···),

where

(19) $a_{ij} = (\phi_j, \psi_i)/\lambda_i = (1/\lambda_i) \int_a^b \phi_j(y)\,\psi_i(y)\, dy.$

⁸ I. A. Barnett, (2).


Let $(\eta_1, \eta_2, \cdots)$ be a set of numbers defined as in Corollary 1, and let
$\zeta_i(t) = z_i(t) - \eta_i$. Then, by (18),

(20) $d\zeta_i/dt = \sum_{j=1}^{\infty} a_{ij}(\zeta_j + \eta_j) = \Phi_i,$

say. Since $\sum_{j=1}^{\infty} |a_{ij}|^2 \le 1/\lambda_i^2$, the series in (20) is uniformly convergent in
any domain of the form $\sum_{j=1}^{\infty} |\zeta_j|^2 < r^2$, and satisfies all the conditions for
being a regular power series in any such domain. Hence, by Corollary 1,
the system of equations (18) has a solution satisfying the initial conditions:

(21) $z_i(0) = \eta_i$ (i = 1, 2, ···).
We can determine this solution by Wintner’s method ® as follows. We 
have 
co 
Si, (t) = (0, 0,- (1 = 1, 2,- “). 
=1 
Hence,’° 
ao 
= 4ij (45 ainm) 
j=1 h=1 
ao 
= aijns + t Daiji nj, 
j=1 n= 
where 
oo 
aij?) = ainda; 
h=1 
consequently 


oO co 
Sio(t) = t + (0/2!) aaj nj. 
j=1 j= 


We obtain by induction 


n co 
Sin(t) => (7k!) 
=1 


k=1 


where a;;) is defined recursively by the equation 


® A. Wintner, (6), p. 251. 


10 Following Wintner’s notation, if g(t) = 3 c,tv, we use the expression [g(t)], 
v=0 


n 
to denote 3 c,tv. 


v=0 




@) 
h=1 
and we have 


co 
aij) 
h=1 


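Thus the solution of (18) satisfying the initial condition (21) is

(22) $z_i(t) = \sum_{k=0}^{\infty} (t^k/k!) \sum_{j=1}^{\infty} a_{ij}^{(k)} \eta_j$ (i = 1, 2, ···),

where we have taken

$a_{ij}^{(0)} = 0$ (i ≠ j), $a_{ii}^{(0)} = 1.$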


The solution of (A) satisfying the initial condition (16) is therefore 


Since f(x, t, u) = u satisfies the condition (9), this solution is unique.

Equation (23) contains Barnett's formula as a particular case; for, if
K(x,y) is symmetric, we have


ai) =0 (ij), 


We can use these results to solve the integro-differential equation:

(24) $\partial\phi(x,t)/\partial t = f(t) \int_a^b K(x,y)\,\phi(y,t)\, dy,$

where f(t) is any positive integrable function of t, with the initial condition
$\phi(x, 0) = \phi_0(x)$. For, if we change to the new variable:

$w = \int_0^t f(u)\, du,$

the equation reduces to one of the form ($A_0$). We find that (24) has the
unique solution


k=0 




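4. More general equations. In order to solve the more general equation

(B) $\partial\phi(x,t)/\partial t + p(t)\,\phi(x,t) = q(x,t) + \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy,$

we put

$w(x,t) = \phi(x,t) \exp(\int_0^t p(u)\, du) - \int_0^t q(x,u) \exp(\int_0^u p(v)\, dv)\, du,$

so that

$\partial w(x,t)/\partial t = [\partial\phi(x,t)/\partial t + p(t)\,\phi(x,t) - q(x,t)] \exp(\int_0^t p(u)\, du).$

Thus, if we multiply both sides of (B) by $\exp(\int_0^t p(u)\, du)$, we obtain the
equation:

(25) $\partial w(x,t)/\partial t = \exp(\int_0^t p(u)\, du) \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy,$

where

$\phi(y,t) = \exp(-\int_0^t p(u)\, du)\,[w(y,t) + \int_0^t q(y,u) \exp(\int_0^u p(v)\, dv)\, du].$

This is of the form (A), and can be treated by the methods of §2.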
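
The differentiation step behind this substitution can be checked on a toy case. The sketch below is only an illustration with the x-dependence suppressed: the choices p(t) = 1, q(t) = t and the profile $\phi(t) = \sin t$ are hypothetical, and with them $\int_0^t u e^u du = (t-1)e^t + 1$. It compares a central difference of w with the claimed expression $[\phi' + p\phi - q]\exp(\int_0^t p)$.

```python
import math


def phi(t):
    """Hypothetical profile standing in for phi(x, t) at a fixed x."""
    return math.sin(t)


def dphi(t):
    return math.cos(t)


def w(t):
    # w = phi * exp(int_0^t p) - int_0^t q(u) exp(int_0^u p) du, with p = 1, q(u) = u
    return phi(t) * math.exp(t) - ((t - 1.0) * math.exp(t) + 1.0)


def dw_claimed(t):
    # [phi' + p * phi - q] * exp(int_0^t p), the expression stated in Section 4
    return (dphi(t) + phi(t) - t) * math.exp(t)


def dw_numeric(t, h=1e-5):
    # central difference approximation to dw/dt
    return (w(t + h) - w(t - h)) / (2.0 * h)


for t in (0.0, 0.7, 1.5):
    print(t, dw_numeric(t), dw_claimed(t))
```

The two columns agree to the accuracy of the finite difference, which is all the substitution needs: the factor $\exp(\int_0^t p)$ absorbs the term $p\phi$ and the subtracted integral absorbs q.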


As a particular case of this, we can show that the equation 


(2, p(t)o(z,t) + K(x, yo(yst)dy, 


with the initial condition $\phi(x, 0) = \phi_0(x)$, has the solution


This is a generalization of another formula of Barnett.¹¹

Note. If in §2 we assume that K(x,y) is an L² kernel without further
restriction, then

$\int_a^b K^2(x,y)\, dy$

is finite for almost all x, and the series $\sum (\phi_i^2(x)/\lambda_i^2)$ converges for almost
all x; therefore the series $\sum \phi_i(x)\, z_i'(t)$ is dominatedly convergent for almost
all x, and can be integrated term by term with respect to t for almost all x,


¹¹ I. A. Barnett, (2), p. 201, equation (13).




i.e. the function (12) exists and (13) is true for almost all x. In this case
we have: ¹²


b 
bm (2, t) /at K(z, y) f(y, t, dm(Y, t) dy 


= lim 1, t, pm(Y, t)ys(y)dy 


i=m+1 
(as m—> 


i.e.

$\partial\phi(x,t)/\partial t = \int_a^b K(x,y)\, f(y, t, \phi(y,t))\, dy$

for almost all x. It is also easy to show that if the equation (A) has two
solutions $\phi(x,t)$ and $\bar\phi(x,t)$ satisfying the initial condition (6), then

$\phi(x,t) = \bar\phi(x,t)$

for almost all x.


FITZWILLIAM HOUSE, 
CAMBRIDGE, ENGLAND 
AND 

SZECHUAN, CHINA. 


REFERENCES. 


(1) V. Volterra, "Sulle equazioni alle derivate funzionali," Atti, Reale Accademia dei
Lincei, serie 5, vol. 23 (1914), pp. 393-399.

(2) I. A. Barnett, "Integro-differential equations with constant limits of integration,"
Bulletin of the American Mathematical Society, vol. 26 (1919), pp. 192-203.

(3) A. Wintner, "Zur Theorie der unendlichen Differentialsysteme," Mathematische
Annalen, vol. 95 (1926), pp. 544-556.

(4) A. Wintner, "Zur Lösung von Differentialsystemen mit unendlich vielen
Veränderlichen," Mathematische Annalen, vol. 98 (1927), pp. 273-280.

(5) A. Wintner, "Zur Analysis im Hilbertschen Raume," Mathematische Zeitschrift,
vol. 28 (1928), pp. 451-470.

(6) A. Wintner, "Upon a theory of infinite systems of non-linear implicit and
differential equations," American Journal of Mathematics, vol. 53 (1931), pp.
241-257.

(7) F. Smithies, "The eigen-values and singular values of integral equations,"
Proceedings of the London Mathematical Society, series 2, vol. 43 (1937), pp.
255-279.

(8) E. Schmidt, "Zur Theorie der linearen und nichtlinearen Integralgleichungen,"
Mathematische Annalen, vol. 63 (1907), pp. 433-476.


¹² F. Smithies, (7), p. 269.



STRUCTURE OF THE HOMOTOPY GROUPS OF MAPPING 
SPACES.* 


By SZE-TSEN HU.


1. Introduction. In 1939, M. Abe [1] established a theorem con- 
cerning the structure of the fundamental group of the space of spherical 
mappings into a metric space. A recent paper of the author [5] solves the 
structure of the higher homotopy groups of the same space. After the pub- 
lication of the paper [5], A. D. Wallace proposed to the author the problem 
to determine the structure of the homotopy groups of a general mapping 
space. The present work is a consequence of this fruitful suggestion. 

Throughout the present paper, let Y denote a pathwise connected
topological space and $y_0$ a given point of Y. Let X be a connected finite
geometric simplicial complex, $X_0$ a closed subcomplex of X (not necessarily
connected), and $X_* = X - X_0$ its open complement. Let m denote the
dimension of $X_*$.

Let us consider the totality of the mappings (i.e. continuous
transformations) $f: X \to Y$ with $f(X_0) = y_0$. These mappings form a space Ω
with the compact-open topology of R. H. Fox [4], which can be described as
follows. For any two sets A in X and W in Y, let M(A, W) denote the set
of mappings $f \in \Omega$ for which $f(A) \subset W$. The compact-open topology of Ω
is defined by selecting as a subbase [6, p. 6] for the open sets of Ω the sets
M(A, W) where A ranges over the compact subsets of X and W ranges
over the open subsets of Y. Let $0: X \to Y$ denote the constant mapping
$0(X) = y_0$. We shall denote by $\Omega_0$ the path-component of Ω containing 0,
i.e. the set of mappings $f \in \Omega$ which can be connected to 0 by a path in Ω.
For simplicity, we shall always denote by $\pi_n = \pi_n(Y)$, n ≥ 1, the n-th
homotopy group of Y with $y_0 \in Y$ as the base point. Our primary object is
to study the structure of the homotopy groups $\pi_r(\Omega_0)$, (r = 1, 2, ···), with
$0 \in \Omega_0$ as the base point.

Our main results can be briefly summarized as follows. For each
r ≥ 1, there is a homomorphism (called the tubular homomorphism)
$\tau_r: \pi_r(\Omega_0) \to \pi_r(Y)$ such that $\tau_r$ is onto if $X_0$ is empty and $\tau_r$ is trivial
(i.e. its image consists of a single element) otherwise. The kernel K of $\tau_r$


* Received November 8, 1948. 


(called the principal subgroup of $\pi_r(\Omega_0)$) is a solvable group [7, p. 15] and
has a decreasing sequence of subgroups

$K = K_0 \supset K_1 \supset \cdots \supset K_{m-1} \supset K_m = 0$

such that each is a normal subgroup of the preceding and that the quotient
group $K_{n-1}/K_n$ (n = 1, 2, ···, m) is abelian and isomorphic with the
difference group

$H_r^n(X_*, \pi_{n+r}) = P^n(X_*, \pi_{n+r}) - R^n(X_*, \pi_{n+r}),$

where the two groups displayed on the right member are subgroups of the
n-th cohomology group $H^n(X_*, \pi_{n+r})$ of the open subcomplex $X_* = X - X_0$
(which is an abstract complex in the sense of A. W. Tucker [6, p. 89]) with
the homotopy group $\pi_{n+r}$ as the coefficient group. If $X_0$ is
non-vacuous, then $\pi_r(\Omega_0) = K$; otherwise, there exists a subgroup T (called the
tubular subgroup) of $\pi_r(\Omega_0)$ which is mapped isomorphically onto $\pi_r(Y)$ by $\tau_r$.
Then it follows that: (1) if r > 1, $\pi_r(\Omega_0)$ is the direct sum of the subgroups
T and K; and (2) $\pi_1(\Omega_0)$ is the direct product of T and K if and only if T
is a normal subgroup of $\pi_1(\Omega_0)$.

A part of the above results overlap with an unpublished work of S. 
Wylie, who has obtained a few theorems concerning the structure of the 
fundamental group of mapping spaces, by a quite different approach with 
some geometric methods. 


2. Preliminaries. Let $E^r$ denote the r-dimensional parallelotope in the
euclidean r-space defined by

$0 \le e_i \le 1$ (i = 1, 2, ···, r),

where $(e_1, e_2, \cdots, e_r)$ denotes the coordinates of an arbitrary point in the
euclidean r-space. Denote by $E_1^r$ and $E_2^r$ the closed subsets of $E^r$ defined
respectively by the inequalities $e_1 \le 1/2$ and $e_1 \ge 1/2$.

An element $\alpha \in \pi_r(Y)$ is represented by a mapping $f: E^r \to Y$ which
maps the boundary $\partial E^r$ into the point $y_0$. Two such mappings represent the
same element of $\pi_r(Y)$ if and only if they are homotopic relative to $\partial E^r$.
Suppose $\alpha, \beta \in \pi_r(Y)$ be respectively represented by the mappings

$f, g: E^r \to Y$, $f(\partial E^r) = y_0 = g(\partial E^r)$.

Then the element $\alpha\beta^{-1} \in \pi_r(Y)$ is represented by the mapping $h: E^r \to Y$
defined by

$h(e_1, e_2, \cdots, e_r) = f(2e_1, e_2, \cdots, e_r)$ $(0 \le e_1 \le 1/2)$,
$h(e_1, e_2, \cdots, e_r) = g(2 - 2e_1, e_2, \cdots, e_r)$ $(1/2 \le e_1 \le 1)$.
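
The construction of h from f and g is purely a reparametrisation of the first coordinate, and can be sketched in a few lines. This is an illustration only: maps $E^r \to Y$ are modelled as Python functions of r real arguments, the helper name `compose_with_inverse` is hypothetical, and the example maps below (with Y the real line and $y_0 = 0$) are chosen merely so that they vanish on the boundary.

```python
def compose_with_inverse(f, g):
    """Return h with h = f(2 e1, ...) on e1 <= 1/2 and g(2 - 2 e1, ...) on e1 >= 1/2.

    If f and g send the boundary of the cube to the base point, so does h,
    and h represents alpha * beta^{-1} in the sense of Section 2.
    """
    def h(e1, *rest):
        if e1 <= 0.5:
            return f(2 * e1, *rest)
        return g(2 - 2 * e1, *rest)
    return h


# Hypothetical example with r = 2; both maps vanish on the boundary of the square.
f = lambda e1, e2: e1 * (1 - e1) * e2 * (1 - e2)
g = lambda e1, e2: -2 * e1 * (1 - e1) * e2 * (1 - e2)
h = compose_with_inverse(f, g)

print(h(0.25, 0.5), f(0.5, 0.5))   # first half follows f at double speed
print(h(0.75, 0.5), g(0.5, 0.5))   # second half follows g run backwards
print(h(0.5, 0.5), h(0.0, 0.3), h(1.0, 0.3))
```

The two halves agree on the common face $e_1 = 1/2$ because f and g both take the base point there, which is what makes h well defined and continuous.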


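Now, let I denote the closed interval (0, 1) of real numbers and let σ
denote an arbitrary n-simplex with a fixed orientation. Then the topological
product $\Sigma = \sigma \times E^r \times I$ is an oriented (n + r + 1)-dimensional geometric
cell. Let

$\Sigma_0 = (\sigma \times \partial E^r \times I) \cup (\sigma \times E^r \times 1).$

Clearly $\Sigma_0$ is contained in the boundary $\partial\Sigma$ of Σ.

Let us consider an arbitrarily given mapping

$\phi: \partial\Sigma \to Y$, $\phi(\Sigma_0) = y_0$.

Since $\Sigma_0$ is nonvacuous and connected, φ determines a unique element
$(\phi, \Sigma) \in \pi_{n+r}(Y)$ which depends only on the homotopy class [φ] relative to
$\Sigma_0$ and the orientation of Σ. For two arbitrary mappings

$\phi, \psi: \partial\Sigma \to Y$, $\phi(\Sigma_0) = y_0 = \psi(\Sigma_0)$,

let us define a mapping $\chi: \partial\Sigma \to Y$ as follows. For each point

$(x, e_1, e_2, \cdots, e_r, t) \in \partial\Sigma \subset \Sigma$,

where

$x \in \sigma$, $(e_1, \cdots, e_r) \in E^r$, $t \in I$,

we take

$\chi(x, e_1, e_2, \cdots, e_r, t) = \phi(x, 2e_1, e_2, \cdots, e_r, t)$ $(0 \le e_1 \le 1/2)$,
$\chi(x, e_1, e_2, \cdots, e_r, t) = \psi(x, 2 - 2e_1, e_2, \cdots, e_r, t)$ $(1/2 \le e_1 \le 1)$.

Then it is clear that $\chi(\Sigma_0) = y_0$ and $(\chi, \Sigma) = (\phi, \Sigma) - (\psi, \Sigma)$. Here we
have used the additive notation for the group operation in $\pi_{n+r}(Y)$, because
in the sequel we shall consider only the simplexes $\sigma \in X$ with dimension n > 0
and hence $\pi_{n+r}(Y)$ is abelian.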

$E^r$ can be considered as a cell complex with an r-cell $E^r$ and a number
of cells with dimensions less than r on its boundary. Then the topological
product $A = X \times E^r$ is also a cell complex. Let $A_0 = (X \times \partial E^r) \cup (X_0 \times E^r)$,
then $A_0$ is a closed subcomplex of A and its complement $A_* = A - A_0$ an
open subcomplex of A.

According to R. H. Fox [4], the mappings $f: E^r \to \Omega_0$, $f(\partial E^r) = 0$ are
in a (1-1)-correspondence (which preserves homotopy) with the mappings
$F: A \to Y$, $F(A_0) = y_0$. Hence, an element $\alpha \in \pi_r(\Omega_0)$ is represented by
such a mapping F; and two such mappings represent the same element of
$\pi_r(\Omega_0)$ if and only if they are homotopic relative to $A_0$. If two arbitrary
elements $\alpha, \beta \in \pi_r(\Omega_0)$ be represented by the mappings $F, G: A \to Y$, $F(A_0)
= y_0 = G(A_0)$, then the element $\alpha\beta^{-1}$ is represented by the mapping
$H: A \to Y$ defined for each $x \in X$ and each $(e_1, \cdots, e_r) \in E^r$:

$H(x, e_1, e_2, \cdots, e_r) = F(x, 2e_1, e_2, \cdots, e_r)$ $(0 \le e_1 \le 1/2)$,
$H(x, e_1, e_2, \cdots, e_r) = G(x, 2 - 2e_1, e_2, \cdots, e_r)$ $(1/2 \le e_1 \le 1)$.


In the remainder of the present section, we shall give a trivial
generalization of a theorem due to N. E. Steenrod [8, p. 316] which will be used
in the sequel. Let G denote an arbitrary abelian group.

(2.1) The correspondence $\sigma \times E^r \to \sigma$ defines an isomorphic cochain
mapping $\kappa_r: A_* \to X_*$ and thereby induces isomorphisms onto:

$\kappa_r^*: H^{n+r}(A_*, G) \to H^n(X_*, G)$, n ≥ 0.

Proof. That $\kappa_r$ is one-to-one is trivial. Since the coboundary $\delta E^r = 0$,
we have $\delta(\sigma \times E^r) = \delta\sigma \times E^r$. Hence $\kappa_r$ commutes with the coboundary
operator δ, i.e. $\kappa_r$ is a cochain mapping. Q.E.D.

Now, let

$B = A \times I$, $B_0 = (A \times 0) \cup (A_0 \times I) \cup (A \times 1)$, $B_* = B - B_0$.

According to N. E. Steenrod [8, p. 316], the corresponding cochain mapping
$B_* \to A_*$ induces isomorphisms onto:

$H^{n+1}(B_*, G) \to H^n(A_*, G)$, n ≥ 0.

In the sequel, we shall make use of the combined isomorphisms:

$H^{n+r+1}(B_*, G) \to H^n(X_*, G)$, n ≥ 0.

The notations of this section will be used for the whole paper.


3. Some cohomology invariants. Following N. E. Steenrod [8], we
shall put

$A_p = A_0 \cup A^p$, $B_p = B_0 \cup B^p$ (p ≥ 0),

where $A^p$ ($B^p$) denotes the p-dimensional skeleton of A (B), i.e. the set of
cells of A (B) with dimensions not exceeding p. Let

$B_1 = (A_0 \times I) \cup (A \times 1).$

Consider an arbitrary mapping

$f: B_{n+r} \to Y$, $f(B_1) = y_0$, n > 0.


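For an arbitrary (n + r + 1)-cell $s_i = \sigma \times E^r \times I \in B_*$, $\sigma \in X_*$, clearly we
have $f[(\sigma \times \partial E^r \times I) \cup (\sigma \times E^r \times 1)] = y_0$. Hence, according to §2, $f\,|\,\partial s_i$
determines a unique element $(f, s_i) \in \pi_{n+r}(Y)$. By the argument of S.
Eilenberg [3, p. 237], the cochain $c^{n+r+1}(f) = \sum_i (f, s_i)\, s_i$ is a cocycle in $B_*$. The
cocycle $c^{n+r+1}(f)$ represents an element $\gamma^{n+r+1}(f)$ of the cohomology group
$H^{n+r+1}(B_*, \pi_{n+r})$, which depends only on the homotopy class [f] of f relative
to $B_1$. The element $\gamma^{n+r+1}(f)$ is said to be presented by f, and f a presenter
of $\gamma^{n+r+1}(f)$. Such elements are called presentable elements of $H^{n+r+1}(B_*, \pi_{n+r})$.
If $\alpha, \beta$ be arbitrary presentable elements of $H^{n+r+1}(B_*, \pi_{n+r})$
which are presented by the mappings

$f, g: B_{n+r} \to Y$, $f(B_1) = y_0 = g(B_1)$,

then the element $\alpha - \beta$ is clearly presented by the mapping $h: B_{n+r} \to Y$ defined
by

$h(x, e_1, e_2, \cdots, e_r, t) = f(x, 2e_1, e_2, \cdots, e_r, t)$ $(0 \le e_1 \le 1/2)$,
$h(x, e_1, e_2, \cdots, e_r, t) = g(x, 2 - 2e_1, e_2, \cdots, e_r, t)$ $(1/2 \le e_1 \le 1)$,

for each point $(x, e_1, \cdots, e_r, t) \in B_{n+r} \subset B$, where $x \in X$, $(e_1, \cdots, e_r) \in E^r$
and $t \in I$. Hence we have proved the following theorem.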


(3.1) The presentable elements of the cohomology group $H^{n+r+1}(B_*, \pi_{n+r})$
form a subgroup $P^{n+r+1}(B_*, \pi_{n+r})$, called the presentable subgroup of
$H^{n+r+1}(B_*, \pi_{n+r})$.

A mapping $f: B_{n+r} \to Y$ is said to be regular in case $f(B_0) = y_0$. An
element of $H^{n+r+1}(B_*, \pi_{n+r})$ is said to be regular if it can be presented by a
regular mapping. For two arbitrary regular mappings $f, g: B_{n+r} \to Y$, the
mapping h defined above is obviously regular. Hence, we have the following
theorem.

(3.2) The regular elements of the cohomology group $H^{n+r+1}(B_*, \pi_{n+r})$
form a subgroup $R^{n+r+1}(B_*, \pi_{n+r})$, called the regular subgroup of $H^{n+r+1}(B_*, \pi_{n+r})$.

Let us denote by $P^n(X_*, \pi_{n+r})$ and $R^n(X_*, \pi_{n+r})$ the images of
$P^{n+r+1}(B_*, \pi_{n+r})$ and $R^{n+r+1}(B_*, \pi_{n+r})$ under the isomorphism

$H^{n+r+1}(B_*, \pi_{n+r}) \to H^n(X_*, \pi_{n+r})$

described in §2, and define

$H_r^n(X_*, \pi_{n+r}) = P^n(X_*, \pi_{n+r}) - R^n(X_*, \pi_{n+r})$

for each n = 1, 2, ···, m. The groups $P^n(X_*, \pi_{n+r})$ and $R^n(X_*, \pi_{n+r})$ are
respectively called the presentable and the regular subgroup of the cohomology
group $H^n(X_*, \pi_{n+r})$, and their elements are called the presentable and the
regular elements of the same.


(3.3) If H?*(Xs, =0 for each l<p<n, then H."(Xs, ts 
tsomorphic with P"( Xs, mn.r). 


Proof. We need only to prove that R"*'*( Bs, mr) =0. Let a be an 
arbitrary regular element of H"*’*1( Bs, wn,,) ; then, by definition, there exists 
a mapping f: such that f(B)) and y"™(f) =a. Every 
(r+ 1)-cell of Bs is of the form s; = 2; XK H" X I, where x; is a vertex of X.. 
Since the boundary ds; of s; is in Bo, the partial mapping f | s; defines an 
element i € 7r4:(1'). Since X is connected, all the 8; are equal to a fixed 
Clearly 8 =0 if is nonvacuous. Choose a representative 


@: U XT) U (E"X1)] =%, 
for the element £. 
Define a mapping g*: B— Y by taking 
g* (x, e, t) = t), (ze X, ec tel). 
Let g=g*| Br. Since g has g* as an extension over B, we have 
y""1(g) =0. ‘Define a mapping h: Y by taking 


h r t == 
(2, @1, C2, er, ) g(x, 2 — €2,° 


for each point (2, ér, t) € Burc B, where re X, @2,° er) e E’, 
and te J. Clearly we have h(B,) and 


(h ) _ (f) (9) = 


Now, for each (r+ 1)-cell sj ¢ Bs, the partial mapping h | s; represents 
the element 8 — 8 = 0 of the homotopy group ar.:(Y). Hence there exists 
a homotopy (0 St=1) such that hp =h, h,(Br*?) = yo, and 
ht (Bo) = yo for each oS 

Since, for each p= 2,3,- - -,n—1, we have 


(Bs, Tpsr) ~ Tpsr) = 0, 




it follows from the successive application of the first homotopy theorem of 
S. Eilenberg [3, p. 240] that there exists a homotopy kt: Bur + Y 
such that ky = yo, and = yo for each 
oSt=1. Since k, | B*r-1 can be extended throughout B, it follows from 
the first extension theorem of S. EKilenberg [3, p. 239] that 


yr (hy) (Is) = 0. 
Hence R"***1(Bs, = 0. This completes the proof. 


(3.4) If $H^{p+1}(X_*, \pi_{p+r}) = 0$ for each $n < p < m$, then $P^n(X_*, \pi_{n+r})$
coincides with $H^n(X_*, \pi_{n+r})$.


Proof. We need only to prove that P"*'**(Bs, = H™?** (Bs, 
Let c be an arbitrary cocycle in B. with coefficients in mn,r. It can be easily 
seen that there exists a mapping 


Fy: X 0) U XT) U (AX 1) 
such that 
F,[ 4 1) U (A 1)] Yos Fy) 


Define a mapping A"** > Y by taking f.(a) Fy(a,0), (ae A™"). Since 
c is a cocycle in B., f, admits an extension f,: A"*"*?-» Y. f, determines an 
element y"*"*?(f,) H™?*?(As, Since 


As, Tnarst) ~ H"*?(X., Tnsr+1) 0, 


we have y"*r+2(f,) =0. It follows from the first extension theorem of S. 
Eilenberg [3, p. 239] that fy has an extension f,: 4"*"*?—» Y. By successive 
application of this argument, one can prove that f, admits an extension 
f: A> Y. Extend F, toa mapping F: Y by taking F'(a,0) =f (a), 
(ae A). Hence we have c—""*1(F), i.e. every element of the group 
H"*"*?(Bs,mnsr) is presentable. This completes the proof. 


4. The n-trivial subgroup. We have seen in §2 that the elements of
the homotopy group $\pi_r(Q_0)$ are the homotopy classes relative to $A_0$ of the
mappings $f\colon A \to Y$ with $f(A_0) = y_0$.

An element $\alpha \in \pi_r(Q_0)$ is said to be n-trivial $(n \ge 0)$ if it can be
represented by a mapping $f\colon A \to Y$ such that $f(A^{n+r}) = y_0$. The n-trivial
elements of $\pi_r(Q_0)$ obviously form a subgroup $K_n$ of $\pi_r(Q_0)$, called the n-trivial
subgroup of $\pi_r(Q_0)$. Thus, we have a decreasing sequence of subgroups of $\pi_r(Q_0)$:

$\pi_r(Q_0) \supset K_0 \supset K_1 \supset \cdots \supset K_m,$



HOMOTOPY GROUPS OF MAPPING SPACES. 581 


where the last group Km clearly consists of a single element. 

Let $\alpha$ $(n > 0)$ be an arbitrary $(n-1)$-trivial element of $\pi_r(Q_0)$
and let $f\colon A \to Y$, $f(A^{n+r-1}) = y_0$, be a representative of $\alpha$. Define a mapping
$F\colon B^{n+r} \to Y$ by taking


f(a) (ae A, t =0) 
F(a, t) =2 yo (ae tel) 
Yo (ae A, t=—=1). 


Since F'(B,) = yo, F presents a presentable element 


Next, let g: A— Y, g(A”*"-1) = yo, be another representative of «, and 
G: Brtr—»Y be similarly defined for g. Define a mapping ®: B"**-+Y by 
taking 
F(a, 21, @r, t): (oSe,S}), 
(2, €2,° ery t) = 


for each point (2, t)e Berc B, where xe X, +, er) 

and Clearly y"*"*1(®) = Define a mapping 

¢: by taking ¢(a)—®(a,0), (ae A). Since ¢ represents the 

element aa-! == 1 of there exists a homotopy ¢:: Y, (0 St=1), 

such that ¢:(A) = yo, and $+(Ao) = Y for each oS¢t S11. 
Define a homotopy By > Y (oSA=1) by taking 


(aeA, 
®)(a,t) = 2 Yo (ae Ao, tel) 
Yo (ae A, t= 1). 


Since @,—@ is defined over B", it follows from the homotopy extension 
property [2, p. 501] that ®, has an extension Y, 
such that = ©. Since ©,*(By) = yo, hence y"*"*1(®) = y"*"*1(@o*) is a 
regular element, determines an element of which 


depends only on $\alpha$ and will be denoted by $\mu_n(\alpha)$.


(4.1) The correspondence $\alpha \to \mu_n(\alpha)$ is a homomorphism

$\mu_n\colon K_{n-1} \to H_*^n(X_*, \pi_{n+r})$

of the $(n-1)$-trivial subgroup $K_{n-1}$ of $\pi_r(Q_0)$ onto the group $H_*^n(X_*, \pi_{n+r})$,
whose kernel is the n-trivial subgroup $K_n$.




Proof. That $\mu_n$ is a homomorphism is clear. To prove that $\mu_n$ is onto,
let €e P™*"*'(Bs,2n.r) be arbitrarily given. By definition, there exists a 
mapping F: Br+Y, F(B,) such that =€ Define a 
homotopy 
Fy: XI) U (AX 1) Y (oSAS1) 
by taking 


Wile; t) F(a,A +t—At) (ae A, tel), 
Yo (aeA, t= TJ), 


Since Fy) = F is defined over B"*’, it follows from the homotopy extension 
property [2, p. 501] that Fy has an extension F*,: (0oSAS1) 
such that Clearly we have F*,[(A"™"? XI) U (AX1)] 
and F*,(B,) =y for each oSAS1. Define a mapping ¢: A> Y by 
taking ¢(a) = F*,(a,0), (ae A). Since ¢(A"*"-!) = y, represents an 
element ae Ky_;. It is obvious that is the coset which 
contains the image under . of the element 


This proves that $\mu_n$ is onto.

To consider the kernel of $\mu_n$, suppose $\alpha \in K_{n-1}$ and $\mu_n(\alpha) = 0$. Let
$f\colon A \to Y$, $f(A^{n+r-1}) = y_0$, be a representative of $\alpha$ and define $F\colon B^{n+r} \to Y$
as above. Then the element $\gamma^{n+r+1}(F)$ is regular; and hence there exists a
mapping G: Y, G(Bo) = yo, such that = y"""(F). Define 
a mapping ®: Y by taking 

2€1, er, t) (0o=¢,5}), 

® % ryt 


for each point (2, t)€ Br C B, where eX, +, er) 
teZ. Then (B,) = and 


It follows from the first extension theorem of S. Eilenberg [3, p. 239] that 
the partial mapping © | Br-1 has an extension @*: Bur1—» Y, Define a 


homotopy 
by taking 
6*(a,A +t—At) (ae A”, tel), 
= (ae A, t=1). 




Since W) = &* is defined over Bore, it follows from the homotopy extension 
property [2, p. 501] that has an extension ¥*,: Y, (oS AS 1), 
such that ¥*,—*. Define a homotopy f:: A> Y (oStS1) by taking 
fri(a) = (ae A,oStSl). Since G(A X 0) and ¥*, = o*, 
fo clearly represents the same element @ as f does. Since ¥*,(A"*" X I) = yo, 
we have f;(A"") Further, it follows from W*;(B,) —y, that 
$f_t(A_0) = y_0$ for each $0 \le t \le 1$. Hence $\alpha$ is n-trivial.

Conversely, suppose $\alpha \in K_n$ be represented by a mapping $f\colon A \to Y$,
$f(A^{n+r}) = y_0$. Define $F\colon B^{n+r} \to Y$ by taking


f(a) (aed, 
F(a, t) = 4 yo (ae tel), 
Yo (aeA, 


then clearly $\gamma^{n+r+1}(F) = 0$ and hence $\mu_n(\alpha) = 0$. Therefore, the kernel of
$\mu_n$ is the n-trivial subgroup $K_n$. This completes the proof of (4.1).


(4.2) Each group of the sequence

$K_0 \supset K_1 \supset \cdots \supset K_m$

is a normal subgroup of the preceding and each of the quotient groups
$K_{n-1}/K_n$, $(n = 1, 2, \dots, m)$, is abelian and isomorphic with $H_*^n(X_*, \pi_{n+r})$.
Hence, the subgroups $K_n$ $(n = 0, 1, \dots, m)$ are solvable groups [7, p. 15].


Proof. Since $K_n$ is the kernel of $\mu_n$, it is a normal subgroup of $K_{n-1}$.
Since $\mu_n$ is onto, it follows from Noether’s theorem that
$K_{n-1}/K_n \approx H_*^n(X_*, \pi_{n+r})$. Since $H_*^n(X_*, \pi_{n+r})$ is abelian, the
commutator subgroup of $K_{n-1}$ is contained in $K_n$. Hence it can be easily seen
that $K_{n-1}$ is solvable. Q. E. D.


5. The tubular homomorphism. Let $\alpha \in \pi_r(Q_0)$ be an arbitrary element,
and $f\colon A \to Y$, $f(A_0) = y_0$, be a representative of $\alpha$. Choose a point $x \in X$.
Define a mapping $\phi\colon E^r \to Y$ by taking $\phi(e) = f(x, e)$, $(e \in E^r)$. Since
$\phi(\partial E^r) = y_0$, $\phi$ represents an element $\xi \in \pi_r(Y)$. Since $X$ is connected, $\xi$
does not depend on the choice of $x$. Further, it is clear that $\xi$ is independent
of the choice of the representative $f$. Hence $\xi$ depends only on $\alpha$ and will be
denoted by $\tau_r(\alpha)$.


(5.1) The correspondence $\alpha \to \tau_r(\alpha)$ is a homomorphism (called the
tubular homomorphism) $\tau_r\colon \pi_r(Q_0) \to \pi_r(Y)$. $\tau_r$ is onto if $X_0$ is vacuous;
otherwise, $\tau_r$ maps $\pi_r(Q_0)$ into the identity element of $\pi_r(Y)$. The kernel $K$
of $\tau_r$ is the 0-trivial subgroup $K_0$ of $\pi_r(Q_0)$.




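Proof. That $\tau_r$ is a homomorphism is trivial. If $X_0$ is non-empty, we
may choose $x \in X_0$; hence $\phi(E^r) = y_0$ and $\tau_r(\alpha) = 1$ for each element
$\alpha \in \pi_r(Q_0)$. Suppose $X_0$ be empty. Let an arbitrary element $\xi \in \pi_r(Y)$ be
represented by $\phi\colon E^r \to Y$ with $\phi(\partial E^r) = y_0$. Define a mapping $f\colon A \to Y$
by taking $f(x, e) = \phi(e)$, $(x \in X, e \in E^r)$. Since $X_0$ is void, $A_0 = X \times \partial E^r$;
hence we have $f(A_0) = y_0$. $f$ represents an element $\alpha \in \pi_r(Q_0)$. Clearly
$\tau_r(\alpha) = \xi$. Therefore, $\tau_r$ is onto.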

To consider the kernel of $\tau_r$, let us assume $\alpha \in \pi_r(Q_0)$ with $\tau_r(\alpha) = 1$.
Let $f\colon A \to Y$, $f(A_0) = y_0$, be a representative of $\alpha$. For each r-cell
$s_i = x_i \times E^r$, where $x_i$ is a vertex of $X_*$, the partial mapping $f \mid s_i$ represents
the element $\tau_r(\alpha) = 1 \in \pi_r(Y)$. Hence, it follows from the homotopy extension
property [2, p. 501] that there exists a homotopy $f_t\colon A \to Y$, $(0 \le t \le 1)$,
such that $f_0 = f$, $f_1(A^r) = y_0$, and $f_t(A_0) = y_0$ for each $0 \le t \le 1$. Therefore,
$\alpha \in K_0$.

Conversely, suppose $\alpha \in K_0$ and $f\colon A \to Y$, $f(A^r) = y_0$, be a representative
of $\alpha$. If we choose $x$ to be a vertex of $X$, we see that $\tau_r(\alpha) = 1$.
Hence the kernel of $\tau_r$ is $K = K_0$. This completes the proof.

Hereafter, the kernel $K = K_0$ of the tubular homomorphism $\tau_r$ will be
called the principal subgroup of $\pi_r(Q_0)$. The following statements are immediate
consequences of (5.1).


(5.2) If $X_0$ is non-empty, then $\pi_r(Q_0) = K_0$ is a solvable group.


(5.3) If $X_0$ is empty, then $K = K_0$ is a normal subgroup of $\pi_r(Q_0)$
and $\pi_r(Q_0)/K \approx \pi_r(Y)$.


In the remainder of this section, we shall assume that $X_0$ be empty.

An element $\alpha \in \pi_r(Q_0)$ is said to be tubular if it can be represented by
a mapping $f\colon A \to Y$, $f(A_0) = y_0$, such that $f(x_1, e) = f(x_2, e)$ for arbitrary
points $x_1, x_2 \in X$, $e \in E^r$. The tubular elements of $\pi_r(Q_0)$ clearly form a
subgroup $T$, called the tubular subgroup of $\pi_r(Q_0)$.


(5.4) The tubular homomorphism $\tau_r$ maps the tubular subgroup $T$ of
$\pi_r(Q_0)$ isomorphically onto $\pi_r(Y)$.


Proof. That $\tau_r$ maps $T$ onto $\pi_r(Y)$ has been proved in the first paragraph
of the proof of (5.1). To prove that $\tau_r$ maps $T$ isomorphically, let
us suppose that $\alpha \in T$ and $\tau_r(\alpha) = 1$. Let $f\colon A \to Y$ be a representative of $\alpha$
as described above. Choose a point $x_0 \in X$ and define a mapping $\phi\colon E^r \to Y$
by taking $\phi(e) = f(x_0, e)$ for each $e \in E^r$. Since $\phi$ represents the identity
element of $\pi_r(Y)$, there exists a homotopy $\phi_t\colon E^r \to Y$, $(0 \le t \le 1)$, such
that $\phi_0 = \phi$, $\phi_1(E^r) = y_0$, and $\phi_t(\partial E^r) = y_0$ for each $0 \le t \le 1$. Define a
homotopy $f_t\colon A \to Y$, $(0 \le t \le 1)$, by taking $f_t(x, e) = \phi_t(e)$ for each $x \in X$,
$e \in E^r$ and each $0 \le t \le 1$. Then we have $f_0 = f$, $f_1(A) = y_0$, and $f_t(A_0) = y_0$
for each $0 \le t \le 1$. Hence $\alpha = 1$, and the proof is complete.


(5.5) $\pi_r(Q_0)$ is the direct product of its tubular subgroup $T$ and its
principal subgroup $K$ if and only if $T$ is a normal subgroup of $\pi_r(Q_0)$.


Proof. The necessity is trivial. To prove the sufficiency, let us suppose
$T$ to be a normal subgroup of $\pi_r(Q_0)$. Then both $T$ and $K$ are normal
subgroups of $\pi_r(Q_0)$. Let $\alpha \in \pi_r(Q_0)$ be an arbitrary element, and let
$\xi = \tau_r(\alpha) \in \pi_r(Y)$. Since $\tau_r$ maps $T$ isomorphically onto $\pi_r(Y)$, there exists
a unique element $\beta \in T$ such that $\tau_r(\beta) = \xi$. Then the element $\delta = \beta^{-1}\alpha$ is
contained in the principal subgroup $K$; and we have $\alpha = \beta\delta$, i.e. $\pi_r(Q_0) = TK$.
Further, since $\tau_r$ maps $T$ isomorphically and $K$ is the kernel of $\tau_r$, we have
$T \cap K = 1$. Hence, $\pi_r(Q_0)$ is the direct product of $T$ and $K$, [7, p. 16].
This completes the proof.

Since $\pi_r(Q_0)$ is abelian when $r > 1$, we have the following corollary of
(5.5).

(5.6) For each $r > 1$, $\pi_r(Q_0)$ is the direct sum of $T$ and $K$.


Thus we have proved all the results sketched in the introduction. 


6. Specializations. In the present section, we shall deduce some interesting
particular cases of our general results concerning the structure of the
principal subgroup $K$ of $\pi_r(Q_0)$.

(6.1) The principal subgroup $K$ of $\pi_r(Q_0)$ is isomorphic with the cohomology
group $H^n(X_*, \pi_{n+r})$ if the following conditions are satisfied:

(i) $H^{p-1}(X_*, \pi_{p+r}) = 0$ for each $1 < p < n$,

(ii) $H^{p+1}(X_*, \pi_{p+r}) = 0$ for each $n < p < m$,

(iii) $H^p(X_*, \pi_{p+r}) = 0$ except $p = 0$ and $p = n$.

Proof. According to (3.3), the condition (i) implies that $H_*^n(X_*, \pi_{n+r})$
is isomorphic with $P^n(X_*, \pi_{n+r})$. According to (3.4), the condition (ii)
implies that $P^n(X_*, \pi_{n+r})$ coincides with $H^n(X_*, \pi_{n+r})$. Hence we obtain the
isomorphism

$H_*^n(X_*, \pi_{n+r}) \approx H^n(X_*, \pi_{n+r}).$

It follows from (iii) that $H_*^p(X_*, \pi_{p+r}) = 0$ for each $p > 0$ and $p \ne n$.
Hence $K$ is isomorphic with the cohomology group $H^n(X_*, \pi_{n+r})$. Q. E. D.




As a corollary of (6.1), we give the following theorem. 


(6.2) If $\pi_p(Y) = 0$ for each $r < p < m + r$, then the principal subgroup
$K$ of $\pi_r(Q_0)$ is isomorphic with the cohomology group $H^m(X_*, \pi_{m+r})$.


(6.3) If the integral homology groups $H_p(X_*) = 0$ for each $0 < p < m$,
then the principal subgroup $K$ of $\pi_r(Q_0)$ is isomorphic with the cohomology
group $H^m(X_*, \pi_{m+r})$.


Proof. Since $H_p(X_*) = 0$ for each $0 < p < m$, it follows from the first
duality theorem [6, p. 117] that the integral cohomology groups $H^p(X_*) = 0$,
$(0 < p < m)$, and the m-dimensional cotorsion group $T^m(X_*) = 0$. Then
it follows from a theorem of Eilenberg and MacLane [6, p. 347] that
$H^p(X_*, G) = 0$, $(0 < p < m)$, for each discrete coefficient group $G$. Hence
the conditions (i)-(iii) of (6.1) are satisfied with $n = m$. This completes
the proof.

In particular, if we put $X = S^m$, the m-sphere, and $X_0$ either vacuous
or a single point, then (5.5), (5.6), (6.3) include the results of M. Abe [1]
and the author [5].


INSTITUTE OF MATHEMATICS, ACADEMIA SINICA, 
NANKING, CHINA. 


BIBLIOGRAPHY. 


[1] M. Abe, “Ueber die stetigen Abbildungen der n-Sphäre in einen metrischen Raum,”
Japanese Journal of Mathematics, vol. 16 (1939), pp. 169-176.

[2] P. Alexandroff und H. Hopf, Topologie I (1935). 

[3] S. Eilenberg, “Cohomology and continuous mappings,” Annals of Mathematics, 
vol. 41 (1940), pp. 231-251. 

[4] R. H. Fox, “On topologies for function spaces,” Bulletin of the American Mathe- 
matical Society, vol. 51 (1945), pp. 429-432. 

[5] S. T. Hu, “On spherical mappings in a metric space,” Annals of Mathematics, 
vol. 48 (1947), pp. 717-734. 

[6] S. Lefschetz, Algebraic topology, American Mathematical Society Colloquium
Publications, New York (1942).

[7] L. Pontrjagin, Topological groups, Princeton University Press (1939).

[8] N. E. Steenrod, “Products of cocycles and extension of mappings,” Annals of 
Mathematics, vol. 48 (1947), pp. 290-320. 



A PRIORI LAPLACE TRANSFORMATIONS OF LINEAR 
DIFFERENTIAL EQUATIONS.* 


By AUREL WINTNER.


1. The results of the following considerations are in the direction
initiated in [9] and applied in [10]. They deal with the solution of certain
differential equations

(1) $x'' + f(t)x = 0$

(on an unbounded interval

(2) $t^0 < t < \infty$

with an unspecified $t^0$) by means of definite integrals of specific type.


The underlying idea is similar to, but the assumptions and the results 
are substantially distinct from, those in [9]. Correspondingly, [9] will not 
be needed. 


2. If $f(t)$ is a real-valued, continuous function on an unspecified half-line
(2), the differential equation (1) is called oscillatory or non-oscillatory
according as every or no real-valued solution $x(t) \not\equiv 0$ changes sign an
infinity of times as $t \to \infty$. In view of Sturm’s separation theorem, (1)
must be non-oscillatory if it is not oscillatory.

As observed by Kneser [3], the differential equation (1) cannot be
oscillatory if

(3) $-\infty \le \limsup_{t \to \infty} t^2 f(t) < \tfrac{1}{4}$,

whereas it must be oscillatory if

(3 bis) $\tfrac{1}{4} < \liminf_{t \to \infty} t^2 f(t) \le \infty$.

The proof of these criteria follows immediately, if (1) is “compared,” in
Sturm’s sense, with the trivial differential equation

(4) $x'' + c\,t^{-2}x = 0$.

In fact, substitution of

$x = t^{\lambda}$


* Received September 17, 1948. 




into (4) gives

$2\lambda = 1 \pm (1 - 4c)^{1/2}$,

and the last two formula lines show that (4) is oscillatory when $c > \tfrac{1}{4}$ and
non-oscillatory when $c < \tfrac{1}{4}$.
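
The dichotomy for the comparison equation (4) can be checked numerically. The following is only a sketch under stated assumptions: the function names, the sampling interval $1 \le t \le 10^6$ and the grid size are illustrative choices, not part of the paper. It uses the explicit real solution $t^{1/2}\cos(\omega \log t)$ with $\omega = \tfrac{1}{2}(4c-1)^{1/2}$ when $c > \tfrac{1}{4}$, and $t^{\lambda}$ with real $\lambda$ otherwise.

```python
import cmath
import math


def euler_exponents(c):
    """Both roots of 2*lambda = 1 +/- sqrt(1 - 4c), possibly complex."""
    root = cmath.sqrt(1.0 - 4.0 * c)
    return (1.0 + root) / 2.0, (1.0 - root) / 2.0


def is_oscillatory(c):
    """Complex exponents correspond to oscillating real solutions of (4)."""
    return 1.0 - 4.0 * c < 0.0


def real_solution(c, t):
    """One real solution of x'' + c*x/t**2 = 0 for t > 0."""
    disc = 1.0 - 4.0 * c
    if disc < 0.0:
        omega = math.sqrt(-disc) / 2.0
        return math.sqrt(t) * math.cos(omega * math.log(t))
    return t ** ((1.0 + math.sqrt(disc)) / 2.0)


def count_sign_changes(c, t_start=1.0, t_end=1.0e6, samples=20000):
    """Count sign changes of real_solution on a logarithmically spaced grid."""
    changes, prev = 0, 0.0
    for k in range(samples + 1):
        t = t_start * (t_end / t_start) ** (k / samples)
        x = real_solution(c, t)
        if x != 0.0:
            if prev != 0.0 and (x > 0.0) != (prev > 0.0):
                changes += 1
            prev = x
    return changes
```

Since the zeros are equally spaced in $\log t$, the count for $c > \tfrac{1}{4}$ grows without bound as the interval is enlarged, while for $c < \tfrac{1}{4}$ the sampled solution never changes sign.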

In view of Sturm’s comparison theorem, “logarithmic” refinements of
the criteria (3), (3 bis) are corollaries of the asymptotic formulae of Hartman
[1] for the solutions of (1) when f(t) is an arbitrary logarithmico-exponential
function. A criterion of quite another, though still explicit, type was given
in [12].

None of these explicit criteria supplies a condition which is necessary
and sufficient in order that (1) be oscillatory (or non-oscillatory). A
condition which is both necessary and sufficient (but, instead of being
“explicit,” is of the “Lebesgue-Toeplitz” type) was given in [11].


3. The content of the theorem to be proved below can be illustrated
by the following corollary of it:

If $\sum_{n=1}^{\infty} c_n z^n$ converges everywhere, and if

(5) $4c_1 < 1$

and

(6) $c_n \ge 0$ $(n = 1, 2, \dots)$,

then the linear differential equation

(7) $x'' + x \sum_{n=1}^{\infty} (-1)^{n+1} c_n / t^{2n} = 0$

has, for large positive $t$, two linearly independent solutions representable in
the form 


with 


(Sbis) (—i)"d"log a(t) /dt" =— ( dg(s), (n> 0), 


where $\phi(s)$ is a non-decreasing function, the convergence of (8), (8 bis) on
the half-line $t_0 < t < \infty$ is part of the statement, and $t_0$ and the monotone
function $\phi(s)$ depend on the pair of integration constants determining a
solution $x(t)$ of (7).




The parentheses of the integral signs in (8), (8bis) refer to Abelian 
“summations”; so that the convergence of 


f e*g(s)ds 


0 


means that the integral 


converges for every $\epsilon > 0$ and tends to a finite limit as $\epsilon \to 0$.
If $t$ is replaced by $ti$, then (7) goes over into the case

$f(t) = \sum_{n=1}^{\infty} c_n / t^{2n}$

of (1). In view of (5), this $f(t)$ satisfies (3) and, by virtue of (6), is such
that the inequality

($9_k$) $(-1)^k D^k f(t) \ge 0$, where $k = 0, 1, \dots$, $(D^k = d^k/dt^k)$,

holds on the half-line $0 < t < \infty$. Accordingly, (1) is non-oscillatory and
its coefficient function is completely monotone.


4. If this situation is compared with the substitution, t—> ti, applied 
to (7), it is seen that the last italicized statement can be obtained by 
replacing ¢ by ¢ + te, and then letting « (>0) tend to 0, in the assertion, 
(12), of the following theorem: 


On some half-line (2), let $f(t)$ be a function possessing arbitrarily high
derivatives which satisfy the inequalities ($9_k$). Suppose further that $f(t)$
is such as to render the differential equation (1) non-oscillatory (e.g., that
(3) is satisfied by $f$). Then there exists on (2) a $t = T$ having the
following property: There belongs to every $t_0$ satisfying


(10) $T < t_0 < \infty$ $(T > t^0)$

a solution of (1) representable on the half-line

(11) $t_0 < t < \infty$

in the form


(12) = exp— ff 




where $\phi(s)$ depends on $t_0$, is a non-decreasing function of $s$, i.e.,

(13) $d\phi(s) \ge 0$ for $0 \le s < \infty$,

and is such that

(14) $\int_0^{\infty} e^{-\epsilon s}\,d\phi(s) < \infty$ if $\epsilon > 0$

but

(15) $\int_0^{\infty} d\phi(s) = \infty$;

so that, by (12),

(14 bis) $x(t) \ne 0$ on (11)

but

(15 bis) $x(t_0) = 0$.

The proof will depend on the replacement of (1) by

(16) $y' + y^2 + f(t) = 0$,

the Riccati resolvent of (1). It is satisfied by the logarithmic derivative,

(17) $y(t) = x'(t)/x(t)$,

of every solution $x(t) \not\equiv 0$ of (1). If $t_0$ is large enough, the denominator
of (17) does not vanish on (11), since (1) is supposed to be non-oscillatory.


5. Let $x(t)$ be any real-valued, non-trivial ($\not\equiv 0$) solution of (1).
Since $x(t)$ can be replaced by $-x(t)$, and since (1) is non-oscillatory, it
can be assumed that

(18) $x(t) > 0$

on (11), provided the $t_0$ defining (11) is chosen large enough. It will be
assumed that $t_0$ is fixed in accordance with this proviso, and that $t$ is restricted
to the corresponding half-line (11).

According to ($9_0$), the function $f(t)$ is non-negative. It follows therefore
from (1) and (18) that

(18 bis) $x''(t) \le 0$.

On the other hand, it is clear, for reasons of convexity, that (18) and (18 bis)
cannot hold on the entire half-line (11) unless $x(t)$ is non-decreasing, hence
$x'(t) \ge 0$, on (11). It follows therefore from (17) and (18) that $y(t) \ge 0$.


This proves that, if $D^n = d^n/dt^n$, the case $n = 0$ of the inequality

($19_n$) $(-1)^n D^n y(t) \ge 0$

is true on (11). Since $f(t)$ is non-negative, (16) shows that ($19_n$) holds
for $n = 1$ also. In order to prove that ($19_n$) is true for every $n$, it will be
assumed that

(20) ($19_0$), ($19_1$), $\dots$, ($19_k$) are true,

and it will be shown that (20) implies ($19_{k+1}$) by virtue of ($9_k$).

First, if the binomial coefficients are denoted by $C = C(k, j)$, it is
seen from (16), by k-fold differentiation, that

$-D^{k+1} y = D^k f + 2 \sum_{j=0}^{k-1} C(k-1, j)\,(D^j y)(D^{k-j} y).$

It follows therefore from ($9_k$) that ($19_{k+1}$) is true if

$(-1)^k \sum_{j=0}^{k-1} C(k-1, j)\,(D^j y)(D^{k-j} y) \ge 0.$

Since $C(k-1, j) > 0$, it follows that it is sufficient to ascertain the truth
of the $k$ inequalities

$(-1)^k (D^j y)(D^{k-j} y) \ge 0$, where $0 \le j \le k - 1$.

But these $k$ inequalities are implied by the $k + 1$ assumptions (20).
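
The differentiated Riccati identity and the sign pattern ($19_n$) can be verified exactly on a simple instance. The sketch below assumes the comparison case $f(t) = c/t^2$ with the illustrative value $c = 2/9$ and the solution $x(t) = t^{1/3}$, so that $y(t) = (1/3)/t$; these choices and all function names are assumptions made for the example, not taken from the paper.

```python
from fractions import Fraction
from math import comb, factorial

LAM = Fraction(1, 3)      # assumed exponent: x(t) = t**LAM solves (1)
C = LAM * (1 - LAM)       # then f(t) = C / t**2 with C = 2/9 < 1/4


def d_y(n, t):
    """n-th derivative of y(t) = LAM / t, the logarithmic derivative of x."""
    return LAM * (-1) ** n * factorial(n) / Fraction(t) ** (n + 1)


def d_f(k, t):
    """k-th derivative of f(t) = C / t**2."""
    return C * (-1) ** k * factorial(k + 1) / Fraction(t) ** (k + 2)


def riccati_gap(t):
    """y' + y**2 + f, which vanishes when (16) holds."""
    return d_y(1, t) + d_y(0, t) ** 2 + d_f(0, t)


def identity_gap(k, t):
    """Left side minus right side of the k-fold differentiated (16), k >= 1."""
    s = sum(comb(k - 1, j) * d_y(j, t) * d_y(k - j, t) for j in range(k))
    return -d_y(k + 1, t) - (d_f(k, t) + 2 * s)
```

Because the arithmetic is done with `Fraction`, a zero gap is an exact confirmation for the sampled values of $k$ and $t$, not a floating-point coincidence.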


6. This proves that ($19_n$) holds on (11) for every $n$. In other words,
$y(t)$ is completely monotone on (11), i.e., $y(t + t_0)$ is completely monotone
on the half-line $0 < t < \infty$. Hence, the Hausdorff-Bernstein theorem supplies
the existence of a function $\phi(s)$ satisfying (13) and having the property
that

$y(t + t_0) = \int_0^{\infty} e^{-ts}\,d\phi(s)$

holds on the half-line $0 < t < \infty$, i.e.,

(12 bis) $y(t) = \int_0^{\infty} e^{-(t - t_0)s}\,d\phi(s)$

on the half-line (11). The existence of an $x(t)$ represented on (11) by (12)
now follows from (17).
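
A minimal numerical illustration of (12 bis), under assumptions that are not in the paper: take $f \equiv 0$ and the solution $x(t) = t - t_0$, so that $y(t) = 1/(t - t_0)$ and one may take $d\phi(s) = ds$. This measure has infinite total mass, in agreement with (15), while $x(t_0) = 0$, in agreement with (15 bis). The quadrature rule, the truncation point `s_max` and the function names are illustrative choices.

```python
import math


def simpson(g, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0


def log_derivative_from_phi(t, t0, s_max=200.0, n=40000):
    """Approximate the integral in (12 bis) for the assumed d(phi)(s) = ds."""
    return simpson(lambda s: math.exp(-(t - t0) * s), 0.0, s_max, n)
```

For $t - t_0$ not too small, the value returned should agree with $1/(t - t_0)$ up to the truncation and quadrature error.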




The specification (15) was in no sense involved thus far. In fact, (12)
and (13) show that (15) is equivalent to (15 bis), whereas the requirement
imposed above on $t_0$ was nothing like (15 bis), but merely the proviso that
$t_0$ is large enough.

In order to satisfy (15 bis), it is sufficient to combine the latter proviso
with the fact that $x(t_0) = 0$ (along with an arbitrary $x'(t_0) \ne 0$) can be
assigned as an initial condition for a solution $x(t) \not\equiv 0$ of (1).


7. In order to complete the proof of the last italicized theorem, it is
sufficient to apply the following remark:

If a differential equation (1), in which $f(t)$ denotes a real-valued,
continuous function for large positive $t$, is non-oscillatory, then there exists
a $T$ having the property that no (real-valued, non-trivial) solution $x(t)$ of
(1) has more than one zero on the half-line $T \le t < \infty$.

Suppose that there does not exist such a $T$. Then (1) has (real-valued,
non-trivial) solutions $x_1(t), x_2(t), \dots$, the n-th of which has the property
that $x_n(u_n) = 0$ and $x_n(v_n) = 0$, where $u_n < v_n$, and $u_n \to \infty$ as $n \to \infty$.
But Sturm’s separation theorem claims that $x_1(t)$ must have a zero on the
interval $u_n \le t \le v_n$. It follows therefore from $u_n \to \infty$ that the zeros of
$x_1(t)$ cluster at $t = \infty$. This contradicts the assumption that (1) is
non-oscillatory.

APPENDIX. 


The considerations of [9] depended on the following elementary fact
which, in a slightly different formulation, is due to Kneser [4]:

If $f(t)$ is real-valued, continuous and non-positive on (2), then (1) has
a solution satisfying

(21) $x(t) > 0$ and $x'(t) \le 0$

on (2).

This fact leads, again in the direction of [9], to a curious result, which
can be formulated as follows:

If $f(t)$ is real-valued, non-positive and continuous for $0 \le t < \infty$, then
(1) has a solution representable in the form

(22) $x(t) = \text{const.} + \int_0^{\infty} X(s) \cos ts\,ds$ for $0 < t < \infty$,




where

(23) const. $= \lim_{t \to \infty} x(t)$

and

(24) $X(s) \ge 0$.


The existence of the limit (23), which is part of the statements, cannot 
be concluded from the balance of the statements. In fact, (23) can be 
concluded from (22) only if the integral analogue of the Riemann-Lebesgue 
lemma is applicable to the Fourier transform occurring in (22). In view of 
(24), this is the case only if $X(s)$ is absolutely integrable over $0 \le s < \infty$.
But this is not claimed, since t= 0 is excluded in (22). 

It may be mentioned that, in (22),

(25) const. $= 0$ if and only if $\int_0^{\infty} t f(t)\,dt = -\infty$

in (1), where $f \le 0$ (for a simple proof of the characterization (25) of the
vanishing of the limit (23), cf. [2], Appendix; the method applied there
leads to a result sharper than what is claimed by the first of the assertions
of (25)).

First, if $y(t)$, where $0 \le t < \infty$, is any function possessing a continuous
second derivative and satisfying

(26) $y(t) > 0$, $y'(t) \le 0$, $y''(t) \ge 0$

and if, in addition to (26),

(27) $y(t) \to 0$ as $t \to \infty$,

then the Fourier cosine transform,

(28) $\int_0^{\infty} y(s) \cos ts\,ds$,

of $y(t)$ is convergent, and represents a non-negative function on the half-line
$0 < t < \infty$. This follows, by a partial integration, from the fact that

$\sum_{n=1}^{\infty} (-1)^{n+1} a_n \ge 0$ if $a_n \downarrow 0$;

cf. [5], or [6], p. 378, where (27) is omitted.
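
The non-negativity of the cosine transform (28) can be observed on a concrete function. The sketch below assumes $y(s) = e^{-s}$, which satisfies (26) and (27) and whose transform is known in closed form to be $1/(1 + t^2)$; the choice of $y$, the truncation point `s_max`, the quadrature rule and the function names are illustrative assumptions.

```python
import math


def simpson(g, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0


def cosine_transform(y, t, s_max=60.0, n=60000):
    """Truncated approximation of the integral (28) for a decaying y."""
    return simpson(lambda s: y(s) * math.cos(t * s), 0.0, s_max, n)


def convex_decreasing_example(s):
    """Assumed test function y(s) = exp(-s): positive, decreasing, convex."""
    return math.exp(-s)
```

For this example the computed values should be positive and close to $1/(1 + t^2)$ for each sampled $t > 0$.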
Next, if $x(t)$ is a solution of (1) supplied, if $f \le 0$, by Kneser’s




theorem, then, since (21) holds on the half-line $0 \le t < \infty$, the function
$x(t)$ is positive and tends non-increasingly to a limit $x(\infty)$. Furthermore,
$x''(t) \ge 0$, by (1), where $f(t) \le 0$ and $x(t) > 0$. Consequently, (27) and
all three conditions (26) are satisfied by

(29) $y(t) = x(t) - x(\infty)$,

where $0 \le t < \infty$.

The last italicized theorem follows by substituting (29) into (28) and
applying a Fourier inversion. In fact, (27) and (26) together are more
than sufficient for the applicability of a standard criterion for the legitimacy
of Fourier’s inversion (if $t \ne 0$); cf. [7].


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] P. Hartman, “On the linear logarithmico-exponential differential equation of the
second order,” American Journal of Mathematics, vol. 70 (1948), pp. 765-779.

[2] P. Hartman and A. Wintner, “On the Laplace-Fourier transcendents,” ibid., vol. 71
(1949), pp. 367-372.

[3] A. Kneser, “Untersuchungen über die reellen Nullstellen der Integrale linearer
Differentialgleichungssysteme,” Mathematische Annalen, vol. 42 (1893), pp.
409-435.

[4] A. Kneser, “Untersuchung und asymptotische Darstellung der Integrale gewisser
Differentialgleichungen bei grossen reellen Werthen des Arguments, I,”
Journal für die reine und angewandte Mathematik, vol. 116 (1896), pp.
178-212.

[5] A. Lindhagen, Studier öfver Gamma-Funktionen och några beslägtade Transcendenter,
Uppsala, 1887; quoted from [8], p. 184.

[6] G. Pólya, “Ueber die Nullstellen gewisser ganzer Funktionen,” Mathematische
Zeitschrift, vol. 2 (1918), pp. 352-383.

[7] A. Pringsheim, “Ueber neue Gültigkeitsbedingungen für die Fouriersche Integralformel,”
Mathematische Annalen, vol. 68 (1910), pp. 367-408; Nachtrag,
ibid., vol. 71 (1912), pp. 289-298.

[8] J. F. Steffensen, “Bounds of certain trigonometrical integrals,” Tenth Scandinavian
Congress of Mathematicians, Copenhagen, 1947, pp. 181-186.

[9] A. Wintner, “On the Laplace-Fourier transcendents occurring in mathematical
physics,” American Journal of Mathematics, vol. 69 (1947), pp. 87-98.

[10] A. Wintner, “On the normalization of characteristic differentials in continuous
spectra,” Physical Review, vol. 72 (1947), pp. 516-517.

[11] A. Wintner, “A norm criterion for non-oscillatory differential equations,” Quarterly
of Applied Mathematics, vol. 6 (1948), pp. 183-185.

[12] A. Wintner, “A criterion of oscillatory stability,” ibid., vol. 7 (1949), pp. 115-117.



ON ALMOST FREE LINEAR MOTIONS.* 


By AUREL WINTNER.


1. For large positive $t$, say for

(1) $t_0 < t < \infty$,

let $f(t)$ be a continuous function having the following properties: The
improper integral

(2) $\int^{\infty} f(t)\,dt$ is convergent

(it need not converge absolutely) and, when considered as a function of the
lower limit of integration, is such as to make

(3) $\int^{\infty} \Bigl( \max_{t \le s < \infty} \Bigl| \int_s^{\infty} f(u)\,du \Bigr| \Bigr)\,dt < \infty$.

It will be proved that these assumptions are sufficient in order to make
$f(t)$ asymptotically negligible in the homogeneous, linear differential equation

(4) $x'' + f(t)x = 0$

(which is the disturbed form of the trivial differential equation

(5) $y'' = 0$,

that of the free motion). By this is meant that, if $x(t)$ is a non-trivial ($\not\equiv 0$)
solution of (4), there exists a non-trivial solution, $y(t) = c_1 t + c_2 \not\equiv 0$, of
(5) satisfying

(6) $x(t) \sim y(t)$ as $t \to \infty$;

and that, if $(c_1, c_2) \ne (0, 0)$ is a non-trivial pair of integration constants of
$y(t)$, there exists a solution $x(t)$ of (4) satisfying (6).

2. A weaker result is contained in Bôcher’s extension of the Fuchsian
theory of “regular” singular points to real domain; cf. [1]. In fact, if
Bôcher’s independent variable is replaced by $e^{-t}$, where $t$ is the present


* Received October 15, 1948. 


independent variable, his result for the case of a multiple elementary divisor
(loc. cit., §4) leads to the following criterion: If $f(t)$ is continuous on (1)
and if

(7) $\int^{\infty} t\,|f(t)|\,dt < \infty$,

then (4) has a solution satisfying

(8) $x(t) \sim t$

and another, linearly independent, solution satisfying

(9) $x(t) \to 1$,

as $t \to \infty$.
Clearly, the existence of such a pair of solutions is equivalent to the
one-to-one correspondence (6) between the solutions of (4) and (5). But
(7) is more severe than (3) with (2). For, on the one hand, (7) implies,
whilst (2) and (3) do not imply, that

$\int^{\infty} |f(t)|\,dt < \infty$

and, on the other hand, (7) is equivalent to

$\int^{\infty} \Bigl( \int_t^{\infty} |f(s)|\,ds \Bigr)\,dt < \infty$

(Fubini), which requires much more than (3).


3. According to (2),

(10) $F(t) = \int_t^{\infty} f(s)\,ds$

defines on (1) a function. Since this function is continuous and tends to 0
as $t \to \infty$,

(11) $G(t) = \max_{t \le s < \infty} |F(s)|$

defines on (1) a function. Finally, (3) means that

(12) $H(t) = \int_t^{\infty} G(s)\,ds$




defines on (1) a function. Hence, the assertion, to be proved, is equivalent
to the statement that, if $f(t)$ is continuous and such that (10), (11), (12)
define functions on (1), then (4) has two solutions satisfying (8), (9)
respectively.

It will be sufficient to prove the existence of the solution satisfying (9).
In fact, if $f(t)$ is any continuous function on (1) and if $x(t)$ is any function
which does not vanish for $T \le t < \infty$, where $T > t_0$, two differentiations
show that the product

(13) $x(t) \int_T^{t} \{x(s)\}^{-2}\,ds$ $(T \le t < \infty)$

is a solution of (4) whenever $x(t)$ is. But if the solution $x(t)$ satisfies (9),
then the solution (13) is of the form

$\{1 + o(1)\} \int_T^{t} \{1 + o(1)\}^{-2}\,ds = \{1 + o(1)\}\{t + o(t)\} \sim t$,

as required by (8).
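
The passage from a solution of type (9) to one of type (8) by means of (13) can be tried on an explicit case. The sketch below rests on assumptions chosen for illustration only: the solution $x(t) = 1 + 1/t$, the corresponding coefficient $f(t) = -x''/x = -2/\{t^2(t+1)\}$, the lower limit $T = 1$, and a central-difference check of the differential equation. For this $x$ the integral in (13) has the closed form $s - 2\log(s+1) - 1/(s+1)$ up to a constant.

```python
import math


def f(t):
    """Assumed coefficient: f = -x''/x for x(t) = 1 + 1/t."""
    return -2.0 / (t * t * (t + 1.0))


def x(t):
    """Assumed solution of (4) tending to 1, as in (9)."""
    return 1.0 + 1.0 / t


def antiderivative(s):
    """Antiderivative of x(s)**(-2) = s**2 / (s + 1)**2."""
    return s - 2.0 * math.log(s + 1.0) - 1.0 / (s + 1.0)


def second_solution(t, T=1.0):
    """The product (13): x(t) times the integral of x**(-2) from T to t."""
    return x(t) * (antiderivative(t) - antiderivative(T))


def residual(sol, t, h=1.0e-3):
    """Central-difference estimate of sol'' + f*sol at the point t."""
    second = (sol(t + h) - 2.0 * sol(t) + sol(t - h)) / (h * h)
    return second + f(t) * sol(t)
```

Both residuals should be small at the sampled points, and `second_solution(t) / t` should approach 1 for large `t`, in line with (8).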
Accordingly, it is more than sufficient to prove the existence of a solution
satisfying

(14) $x(t) \to 1$ and $x'(t) \to 0$ $(t \to \infty)$.


4. To this end, it will first be shown that the recursion formula

(15) $z_{n+1}'(t) = F(t) z_n(t) + \int_t^{\infty} F(s) z_n'(s)\,ds$,

where

(16) $z_1'(t) = F(t)$,

and the “initial” conditions

(17) $z_n(\infty) = 0$,

where $n = 1, 2, \dots$, define on (1) a sequence of functions, $z_1(t), z_2(t), \dots$,
and that

(18) $w_n(t) = \int_t^{\infty} |z_n'(s)|\,ds$

exists ($< \infty$) for $n = 1, 2, \dots$ and for every $t$ contained in (1).




The three functions (10), (11), (12) are supposed to exist. Hence,

$z_1(t) = -\int_t^{\infty} F(s)\,ds$

defines a function satisfying (16) and the case $n = 1$ of (17), and (16)
shows that the case $n = 1$ of (18) defines a function, $w_1(t)$. In order to
apply an induction, suppose that $z_n(t)$ and $w_n(t)$ exist for a fixed $n$.
According to (17) and (18),

$|z_n(t)| \le w_n(t)$.
Hence, from (11), 


since, according to (18), the function w,(¢) is non-increasing. On the 
other hand, by (11) and (18), 


t 


where, according to (12) and the monotony of w,(t), 
f G(u)w,(u)du S H(t)wnr(t). 
If the last three formula lines are compared with (15), it is seen that 


| ds S + H(t) 


In view of (18), this proves that the function $w_{n+1}(t)$ exists. It also follows
that the function

$z_{n+1}(t) = -\int_t^{\infty} z_{n+1}'(s)\,ds$ $(z_{n+1}(\infty) = 0)$

exists. This completes the induction.
In view of (18), the last inequality implies that 


| ds (E(t) + wi (2). 




On the other hand, (10), (11), (12) show that, if $t_0$ is large enough, both

$G(t) < \tfrac{1}{4}$ and $H(t) < \tfrac{1}{4}$

hold for every $t$ on (1). Finally, from (18) and (16),

$w_1(t) = \int_t^{\infty} |F(s)|\,ds \to 0$ as $t \to \infty$,

by (11) and (12).
5. The last three formula lines imply that

(19) $\int_t^{\infty} |z_n'(s)|\,ds < \epsilon(t)/2^n$,

where $\epsilon(t)$ is a positive function satisfying $\epsilon(t) \to 0$ as $t \to \infty$. Moreover,

(20) $|z_n(t)| < \epsilon(t)/2^n$,

by (19) and (17).


In terms of the functions $z_n(t)$, define the functions $x_n(t)$ by the
recursion formula

(21) $x_n(t) = z_n(t) + x_{n-1}(t)$,

where

(22) $x_0(t) = 1$.

According to (21) and (15),

(23) $x_n'(t) = F(t) x_{n-1}(t) + \int_t^{\infty} F(s) x_{n-1}'(s)\,ds$

provided that $n > 1$. However, since (22) and the case $n = 1$ of (21) lead
to $x_1'(t) = z_1'(t)$, it is seen from (15) and (22) that (23) holds for $n = 1$
also. Finally, from (17) and (21),

(24) $x_n(\infty) = 1$.

Clearly, (22), (23) and (24) represent a direct definition of the functions
$x_n(t)$. Their indirect definition is

(25) $x_n(t) = 1 + \sum_{m=1}^{n} z_m(t)$,




by (21) and (22). Since $\sum_{m=1}^{n} \epsilon(t)/2^m < \epsilon(t)$, the latter definition and (19),
(20) show that

(26) $|1 - x_n(t)| < \epsilon(t)$ and $\int_t^{\infty} |x_n'(s)|\,ds < \epsilon(t)$,

and (25) shows that $x_n(t)$ tends, as $n \to \infty$, to a limit, say $x(t)$, satisfying

(27) $|x(t) - x_n(t)| \le \epsilon(t)/2^n$ and $\int_t^{\infty} |d\{x(t) - x_n(t)\}| \le \epsilon(t)/2^n$.

Since every $x_n(t)$, being differentiable, is continuous, the first of the inequalities
(27) implies that $x(t)$ is continuous; and that

(28) $x(\infty) = 1$,

by the first of the relations (26), where $\epsilon(t) \to 0$ as $t \to \infty$.


6. In order to carry out the limit process, $n \to \infty$, in (23), a sufficient
estimate of the first derivatives of the functions (21) remains to be ascertained.

According to (15) and (11),

$|z_{n+1}'(t)| \le G(t)\,|z_n(t)| + G(t) \int_t^{\infty} |z_n'(s)|\,ds$.

Hence, if $n$ is replaced by $n - 1$, it follows from (19) and (20) that

(29) $|z_n'(t)| \le 4G(t)\epsilon(t)/2^n$.


This estimate is sufficient to assure the legitimacy of the limit process in 
question. In fact, 7(¢) was defined as the limit of z,(¢) asn—o. This 
means, by (25), that 


(30) —14+ 3 a(t). 


But (29) implies that the derived series of (30) is uniformly convergent. Hence, the function $x(t)$ has a derivative,

(31) $x'(t) = \sum_{n=1}^{\infty} z_n'(t)$.


In addition, $x'(t)$ is continuous, by (31) and (29), and since every $z_n'(t)$ is continuous. Moreover, since $G(t) \to 0$ and $\varepsilon(t) \to 0$ as $t \to \infty$, it is seen from (31) and (29) that


(32) $x'(\infty) = 0$.


It is now clear from the existence (4 «) of all three functions (10), 
(11), (12) that the limit process, $n \to \infty$, leads from (23) to

$x'(t) = F(t)\,x(t) + \int_t^\infty F(s)\,x'(s)\,ds$.


On the other hand, since (10) implies that $F(t) \to 0$ as $t \to \infty$, it implies, by (28), that $F(t)x(t) \to 0$. Hence, it is seen from (10) that a partial integration gives

$\int_t^\infty F(s)\,x'(s)\,ds = -F(t)\,x(t) + \int_t^\infty f(s)\,x(s)\,ds$.

According to the last two formula lines, $x'(t) = \int_t^\infty f(s)\,x(s)\,ds$ is an identity. Since $f(t)$ and $x(t)$ are continuous, the function on the right of this identity is differentiable. Hence, the same is true of the function on the left. Accordingly, $x''(t) = -f(t)\,x(t)$. This means that $x(t)$ is a solution of (4). On the other hand, (24) and (32) mean that $x(t)$ satisfies (14). Hence, the proof is complete.
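
The integral identity just used can be checked on an explicit function. The following worked example is only an illustration and is not taken from the paper; the choice $x(t) = 1 + t^{-2}$ (for $t > 0$) is a hypothetical one, made because every integral is elementary.

```latex
\begin{align*}
x(t) &= 1 + t^{-2}, \qquad x'(t) = -2t^{-3}, \qquad x''(t) = 6t^{-4},\\
f(t) &= -\frac{x''(t)}{x(t)} = -\frac{6}{t^{4} + t^{2}}, \qquad f(t)\,x(t) = -6t^{-4},\\
\int_t^\infty f(s)\,x(s)\,ds &= -6\int_t^\infty s^{-4}\,ds = -2t^{-3} = x'(t),\\
x(\infty) &= 1, \qquad x'(\infty) = 0 .
\end{align*}
```

In this example $x'' = -f x$ holds by construction, and the boundary behaviour at $t = \infty$ is that of a solution tending to 1 with derivative tending to 0.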


Appendix. 


It may be mentioned that, if $f(t)$ is real-valued and does not change its sign for large $t$, then the pair of conditions (2), (3), which is always sufficient, becomes necessary as well, in order that some solution of (4) should satisfy (9). In fact, the calculation made between (9) and (10) not only shows that (7) always implies (2) and (3), but it also shows that (2) and (3) imply (7) if it is assumed that one and the same of the inequalities

(33) $\pm f(t) \ge 0$

holds from a certain $t$ onward. Accordingly, all that is claimed is that, in this particular case, (7) is necessary for the existence of a solution satisfying (9). But the truth of this remark is known; it goes back to Weyl [2], §1. A simple proof proceeds as follows:




Suppose that a continuous $f(t)$ satisfying the alternative (33) fails to satisfy (7). Then

(34) $\int^t s f(s)\,ds \to \pm\infty \qquad (t \to \infty)$.

On the other hand, if (4) had a solution satisfying (9), then, since (4), (9) and (33) imply that

(35) $-s\,x''(s) = \pm\,|s\,x(s)\,f(s)|$

holds for large $s$, it would follow from (34) and (9) that $|\int^t s\,x''(s)\,ds| \to \infty$ as $t \to \infty$. Since a partial integration gives

$\int^t s\,x''(s)\,ds = \text{const.} + t\,x'(t) - \int^t x'(s)\,ds$,

and since (9) implies the convergence of the integral $\int^\infty x'(t)\,dt$, it follows that $|t\,x'(t)| \to \infty$. Hence, if $x'(t)$ does not change sign from a certain $t$ onward, a quadrature shows that $|x(t)|$ tends, as $t \to \infty$, to $\infty$ (stronger than any fixed multiple of $\int^t dt/t = \log t$). But this contradicts (9). Finally, $x'(t)$ cannot change sign from a certain $t$ onward, since $x'(t)$ is ultimately monotone. In fact, (35) shows that $x''(t)$ does not change sign (from a certain $t$ onward).


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] M. Bôcher, “On regular singular points of linear differential equations of the
second order whose coefficients are not necessarily analytic,” Transactions
of the American Mathematical Society, vol. 1 (1900), pp. 40-52.

[2] H. Weyl, “Ueber gewöhnliche lineare Differentialgleichungen mit singulären Stellen
und ihre Eigenfunktionen,” Nachrichten von der Königlichen Gesellschaft
der Wissenschaften zu Göttingen, 1909, pp. 37-63.




ON THE SMALLNESS OF ISOLATED EIGENFUNCTIONS.* 


By AUREL WINTNER.


1. In the linear differential equation

(1) $x'' + (f(t) + \lambda)x = 0$,

where $0 \le t < \infty$ and $-\infty < \lambda < \infty$, let $f(t)$ be a real-valued, continuous function satisfying

(2) $\limsup_{t \to \infty} f(t) < \infty$.

For every value of the parameter $\lambda$, consider only real-valued solutions

$x = x(t; c_1, c_2)$; $c_1 = x(0)$, $c_2 = x'(0)$.

If

(3) $x'' + (f(t) + \lambda_0)x = 0$

has a non-trivial solution of class $(L^2)$, that is, a solution $x = y(t)$ satisfying

(4) $0 < \int_0^\infty y^2(t)\,dt < \infty$,

then $y(t)$ is called an eigenfunction (belonging to the $\lambda = \lambda_0$ occurring in (3) and to that boundary condition, say

(5) $x(0)\cos\phi + x'(0)\sin\phi = 0$,

which is satisfied by $x = y(t)$ at $t = 0$).

According to Weyl [3], p. 238, the assumption (2) precludes the existence of a $\lambda_0$ corresponding to which (3) had two linearly independent solutions of class $(L^2)$. Correspondingly, every boundary condition (5) determines for (1) a spectrum $S(\phi)$, containing a (possibly vacuous) point spectrum $P(\phi)$. The latter consists of those values $\lambda_0$ corresponding to which there exists an $x = y(t)$ satisfying (3), (4), (5). A theorem of Weyl implies that the set of cluster values of $S(\phi)$, that is, the set consisting of the continuous spectrum and the derivative of the point spectrum, is independent of $\phi$; cf. [3], pp. 251-252. This invariant $\lambda$-set, $S'(\phi)$, can therefore be denoted


* Received July 2, 1948. 


simply by $S'$. Accordingly, a $\lambda$ is an isolated point of $S(\phi)$ if and only if it is in $P(\phi)$ without being in $S'$.

By an application of the method used in [7], pp. 27-30, it was shown in [2] that every real number not contained in $S'$ is in $P(\phi)$ for some $\phi$. By a further development of the same method, a method related to the Lebesgue-Toeplitz norm principle of arbitrary functions, a property of the eigenfunctions belonging to isolated points of the spectrum will be established in the sequel.

2. The definition, (4), of an eigenfunction does not imply that $y(t) \to 0$ as $t \to \infty$, and still less that $y(t)$ tends to 0 fast. Actually, even if (2) is refined to

(6) $f(t) \to 0$ as $t \to \infty$,

it is quite possible that (3) has for some $\lambda_0$, say for

(6 bis) $\lambda_0 = 1$,


a solution «—y(t) which, though it satisfies (4), fails to be O(¢") as 
$t \to \infty$; cf. [6], p. 269. On the other hand, the eigenfunctions tend very fast to 0 in all the “classical” cases of wave equations, that is to say, of differential equations (1) which are integrable “explicitly.”

The purpose of the following consideration is to give some account of this situation. The explanation will be that in the “constructed” examples, in which an eigenfunction fails to tend to 0 fast, the corresponding eigenvalue cannot be an isolated point of the spectrum. What will actually be proved is the following theorem:


If a solution $x = y(t)$ of (3), where (2) is assumed, satisfies (4), then either $\lambda = \lambda_0$ is not an isolated point of the spectrum or else

(7) $y(t) = O(t^{-k})$ for every $k$ $(t \to \infty)$.

It remains undecided whether (7) can be refined to

(8) $y(t) = O(e^{-ct})$ for some $c > 0$.

In the explicit cases referred to above, as well as when $f(t)$ is periodic, (8) holds whenever (4) does.

If (7) could be refined to (8), at least when (2) is replaced by (6),
there would result, as a corollary and with a direct proof, a theorem of 
Hartman [1], which states that (6 bis) cannot be an isolated point of the 
spectrum in the case (6). In fact, in order to obtain this as a corollary, it 




would be sufficient to observe that, as easily verified (cf., e.g., [4], p. 391), a solution of (3) in the case (6), (6 bis) must vanish identically if it is $O(e^{-ct})$ for some $c > 0$.


3. Let $x = y(t)$ be a function satisfying (4) and (3), where $\lambda_0$ is some real number and $f(t)$ is subject to (2). Suppose further that $\lambda_0$ is not in $S'$, the common derivative of every $S(\phi)$. It will be proved that these assumptions imply (7). This will prove the italicized alternative (the claim of which is not disjunctive).

First, since $y(t)$ does not vanish identically, $y(0)$ and $y'(0)$ do not vanish simultaneously, and so (5) holds for $x = y$ and for a certain $\phi = \phi_0$ which is unique (mod $\pi$). Choose any $\phi$ satisfying

(9) $\phi \not\equiv \phi_0$ (mod $\pi$).

It is easy to see that $\lambda_0$ cannot then be in $S(\phi)$.

In fact, $S(\phi) = S' + P(\phi)$. But $\lambda_0$ is not in $S'$, by assumption. Hence, $\lambda_0$ is not in $S(\phi)$ if it is not in $P(\phi)$. Suppose, if possible, that $\lambda_0$ is in $P(\phi)$. Then, since $\phi_0$ was so defined as to make $\lambda_0$ a point of $P(\phi_0)$, it follows from (9) that two point spectra belonging to distinct boundary conditions have a point ($= \lambda_0$) in common. But this is impossible, since it means that (3) has two linearly independent solutions of class $(L^2)$, which, as mentioned after (5), is prevented by (2).

This proves that $\lambda_0$ is not in $S(\phi)$. But to say that $\lambda_0$ is not in $S(\phi)$ is equivalent to the following assertion (cf. [3], p. 251): The inhomogeneous differential equation

(10) $v'' + (f(t) + \lambda_0)v = u(t)$

and the boundary condition (5), where $x = v$, have a unique solution $v = v(t)$ satisfying

(11) $\int_0^\infty v^2(t)\,dt < \infty$,

whenever $u(t)$ in (10), where $0 \le t < \infty$, is given so as to satisfy the following restrictions:

(12) $u(t)$ is continuous and $\int_0^\infty u^2(t)\,dt < \infty$

(all functions are meant to be real-valued).

In particular, (10) has, for every continuous $u(t)$ of class $(L^2)$, a




solution $v(t)$ of class $(L^2)$. Only this fact, which does not involve any boundary condition (and, correspondingly, no uniqueness) for $v(t)$, will be needed in the sequel.


4. Let $x = z(t)$ be any solution of (3) which is not a constant multiple of the solution $x = y(t)$ of (3). Then, since $y(t)$ does not vanish identically, the Wronskian of $y(t)$ and $z(t)$ is a non-vanishing constant. The latter can be made 1 if $z(t)$ is multiplied by a constant. Then

(13) $z(t)y'(t) - y(t)z'(t) = 1$

is an identity in $t$. Since both $x = y(t)$ and $x = z(t)$ satisfy (3), it is readily verified from (13) that, if $u(t)$ is any continuous function,

(14) $v_0(t) = y(t)\int_0^t z(s)u(s)\,ds - z(t)\int_0^t y(s)u(s)\,ds$

is a solution, $v = v_0(t)$, of (10). Hence, every solution of (10) is of the form

(15) $v(t) = a\,y(t) + b\,z(t) + v_0(t)$,

where $a$, $b$ are constants.
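
The verification that (14) solves (10) is a short variation-of-constants computation. It is written out below as a sketch, on the assumption (not legible in the scanned formula) that both integrals in (14) have the lower limit 0; any other fixed lower limit only changes $v_0$ by a combination of $y$ and $z$.

```latex
\begin{align*}
v_0' &= y'\int_0^t z u\,ds - z'\int_0^t y u\,ds + (yz - zy)\,u
      = y'\int_0^t z u\,ds - z'\int_0^t y u\,ds,\\
v_0'' &= y''\int_0^t z u\,ds - z''\int_0^t y u\,ds + (z y' - y z')\,u,\\
v_0'' + (f + \lambda_0)v_0
     &= \{y'' + (f + \lambda_0)y\}\int_0^t z u\,ds
      - \{z'' + (f + \lambda_0)z\}\int_0^t y u\,ds + (z y' - y z')\,u = u .
\end{align*}
```

The two braces vanish because $y$ and $z$ satisfy (3), and the last coefficient equals 1 by the normalization (13).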

In view of (4) and of the linearity of the $(L^2)$-space, (15) will satisfy (11) for some $a$ and some $b$ only if (15) satisfies (11) for $a = 0$ and for some $b$. Hence, if (14) is substituted into (15), it is seen that (10) cannot have a solution $v(t)$ of class $(L^2)$ for a given continuous $u(t)$ unless $v(t)$ can be chosen as

(16) $v(t) = b\,z(t) + y(t)\int_0^t z(s)u(s)\,ds - z(t)\int_0^t y(s)u(s)\,ds$.

Since (10) is supposed to have a solution of class $(L^2)$ whenever the given $u(t)$ satisfies (12), it follows that there belongs to every $u = u(t)$, satisfying (12), some constant $b = b_u$ corresponding to which the function (16) becomes of class $(L^2)$. It will be assumed that the constant $b = b_u$ (which turns out to be unique) is so chosen.

In order to calculate $b = b_u$, differentiate (16), multiply the derived relation by $y(t)$, the relation (16) itself by $-y'(t)$, and add. This gives

$y v' - v y' + b = (z y' - y z')\int_0^t u(s)y(s)\,ds$,




an identity in $t$. It can be simplified to

(17) $w(t) = -b + \int_0^t u(s)y(s)\,ds$,

if use is made of the identity (13) and of the abbreviation

(18) $w(t) = y(t)v'(t) - v(t)y'(t)$.

By Schwarz’s inequality, (4) and (12) imply that the product $uy$ is of class $(L) = (L^1)$. It follows therefore from (17) that $w(t)$ tends, as $t \to \infty$, to a limit, and that $w(\infty)$ is connected with the constant $b = b_u$ by the relation

(19) $w(\infty) = -b + \int_0^\infty u(s)y(s)\,ds$,

where the integral is absolutely convergent.


5. It was shown in [5] that, if $f$, $u$ satisfy (2), (12) respectively, then a solution, $v(t)$, of (10) cannot be of class $(L^2)$ unless its derivative, $v'(t)$, is of class $(L^2)$. Hence, not only (16) but also the derivative of (16) must be of class $(L^2)$, if $u(t)$ is any function satisfying (12) and $b$ in (16) denotes the constant (19). It also follows that $y'(t)$ is of class $(L^2)$. In fact, $y(t)$ is of class $(L^2)$ and is a solution of (3), i.e., of the case $u \equiv 0$ of (12).

Accordingly, all four functions $v$, $v'$, $y$, $y'$ are of class $(L^2)$. But (18) shows that $w$ is a bilinear form in these functions. Consequently,

(20) $w(t)$ is of class $(L) = (L^1)$.

Since the limit $w(\infty)$ exists, it is clear from (20) that

(21) $w(\infty) = 0$.

But (21) and (19) reduce (17) to

(22) $w(t) = -\int_t^\infty y(s)u(s)\,ds$.

It follows therefore from (20) that

(23) $\int_0^\infty \left|\int_t^\infty y(s)u(s)\,ds\right| dt < \infty$.


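Accordingly, the eigenfunction $y(t)$ has the property that (23) holds for every $u(t)$ satisfying (12).

Since (12) is satisfied by $u = y$, it follows that

(24) $\int_0^\infty \int_t^\infty y^2(s)\,ds\,dt < \infty$.

Hence, by Fubini’s theorem,

(25) $\int_0^\infty \left(\int_0^s dt\right) y^2(s)\,ds < \infty$,

so that

(26) $\int_0^\infty t\,y^2(t)\,dt < \infty$.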
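
The passage from (24) to (26) is an interchange of the order of integration. The following is a minimal numerical sketch of that interchange, not part of the paper: the sample function $y(t) = e^{-t}$, the truncation point `T` and the grid size `n` are all hypothetical choices, and both integrals are approximated by elementary quadrature rules.

```python
import math


def tail_integrals(g, T, n):
    """Midpoint-rule approximations of the tail integrals of g over [t_i, T]
    on the uniform grid t_i = i*T/n, i = 0..n."""
    h = T / n
    cells = [g((i + 0.5) * h) * h for i in range(n)]
    tails = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tails[i] = tails[i + 1] + cells[i]
    return h, tails


def both_sides(y, T=40.0, n=40000):
    """Return (iterated integral as in (24), weighted integral as in (26)),
    both truncated at T."""
    h, tails = tail_integrals(lambda s: y(s) ** 2, T, n)
    # Outer integral over t of the tail integral (trapezoid rule).
    iterated = h * (sum(tails) - 0.5 * (tails[0] + tails[-1]))
    # Integral of t * y(t)^2 (midpoint rule).
    weighted = sum(((i + 0.5) * h) * y((i + 0.5) * h) ** 2 * h for i in range(n))
    return iterated, weighted


# Hypothetical sample: y(t) = exp(-t), for which both sides equal 1/4.
iterated_value, weighted_value = both_sides(lambda t: math.exp(-t))
```

For this sample both numbers agree with the exact value 1/4 up to the quadrature error, which is the content of Fubini's theorem in this situation.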
6. In the above proof of (26), use had to be made of the fact that, due to (2),

(27) $\int_0^\infty y'^2(t)\,dt < \infty$

is implied by (4). By adaptation of the proof of this implication (cf. [5]), it will now be shown that, due to (2),

(28) $\int_0^\infty t\,y'^2(t)\,dt < \infty$

is implied by (26).

In terms of the given $f$ of (1), define $f^+$ and $f^-$ by placing

$f^+(t) = \max(f(t), 0)$ and $f^-(t) = \min(f(t), 0)$.

Then $f^+ \ge 0$, $f^- \le 0$, and $f = f^+ + f^-$. Since (2) means that $f^+(t) = O(1)$ as $t \to \infty$, it follows from (26) that

$\int_0^t f^+(s)\,s\,y^2(s)\,ds = O(1)$.

Hence, if $x = y(t)$ and $f(t) = f^+(t) + f^-(t)$ are substituted into (3) and if the result is multiplied by $t\,y(t)$, it follows that

$\int_0^t s\,y(s)\,y''(s)\,ds + \int_0^t s f^-(s)\,y^2(s)\,ds = O(1)$,




by (26). But

$s\,y(s)\,y''(s) = \tfrac12\,s\,(y^2(s))'' - s\,y'^2(s)$

is an identity. Furthermore, a partial integration gives

$\int_0^t s\,(y^2(s))''\,ds = 2t\,y(t)\,y'(t) - 2\int_0^t y(s)\,y'(s)\,ds$.

Finally, from (4) and (27),

$\int_0^t y(s)\,y'(s)\,ds = O(1)$

by Schwarz’s inequality. Clearly, the last four formula lines imply that

(29) $t\,y(t)\,y'(t) - \int_0^t s\,y'^2(s)\,ds + \int_0^t s f^-(s)\,y^2(s)\,ds = O(1)$.

In order to deduce (28) from (29), suppose that (28) is false. Then, as $t \to \infty$,

$-\int_0^t s\,y'^2(s)\,ds \to -\infty$.

It follows therefore from (29), where $f^-(s) \le 0$, that $t\,y(t)\,y'(t) \to \infty$. In particular, $y(t)y'(t)$ is positive, hence $y^2(t)$ is increasing, from a certain $t$ onward. Since this contradicts (4), the proof of (28) is complete.

A by-product of this consideration is that the eigenfunction $y(t)$ must have, with reference to the coefficient function of (1), the following property:

(30) $\int_0^\infty t\,|f(t)|\,y^2(t)\,dt < \infty$.

In fact, since (2) means that $f(t)$ is of the form $f^-(t) + O(1)$, where $f^-(t) \le 0$, it is clear from (26) that the negation of the truth of (30) is equivalent to

$\int_0^t s f^-(s)\,y^2(s)\,ds \to -\infty$.

On the other hand, from (29) and (28),

$t\,y(t)\,y'(t) + \int_0^t s f^-(s)\,y^2(s)\,ds = O(1)$.




But the last two formula lines imply the relation $t\,y(t)\,y'(t) \to \infty$ which, as before, contradicts (4).


7. The assertion (20) depended on the fact that both $v(t)$ and $v'(t)$ are of class $(L^2)$. On the other hand, (26) and (28) mean that both $t^{1/2}y(t)$ and $t^{1/2}y'(t)$ are of class $(L^2)$. Hence (18) shows that $t\,w(t)$ is a bilinear form in functions of class $(L^2)$, and so

(31) $t\,w(t)$ is of class $(L) = (L^1)$.

The proof of (26) depended entirely on (20). Correspondingly, if (20) is replaced by (31) in the deductions leading from (20) to (26), what results is the following refinement of (26):

$\int_0^\infty t^2 y^2(t)\,dt < \infty$.

This in turn leads to

$\int_0^\infty t^2 y'^2(t)\,dt < \infty$,

if the argument which led from (26) to (28) is repeated. In fact, (28) can now be used instead of (27).

Clearly, the process adapts itself to a complete induction and proves therefore that both

(32) $\int_0^\infty t^k y^2(t)\,dt < \infty$

and

(33) $\int_0^\infty t^k y'^2(t)\,dt < \infty$

hold for arbitrarily large values of $k$. It also follows that (30) can be replaced by

(34) $\int_0^\infty t^k\,|f(t)|\,y^2(t)\,dt < \infty$.


Let $0 < t < T < \infty$ and $k > 1$. Since, by Schwarz’s inequality,

$\left(\int_t^T |y'(s)|\,ds\right)^2 \le \int_t^T s^{-k}\,ds \int_t^T s^k y'^2(s)\,ds$,

(33) supplies the existence of a constant $C_k$ satisfying

$|y(T) - y(t)| < C_k\,t^{-(k-1)/2}$.

Hence, (7) follows by letting $T \to \infty$.


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] P. Hartman, “On the spectra of slightly disturbed linear oscillators,” American
Journal of Mathematics, vol. 71 (1949), pp. 71-79.

[2] P. Hartman and A. Wintner, “An oscillation theorem for continuous spectra,” Proceedings
of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[3] H. Weyl, “Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die
zugehörigen Entwicklungen willkürlicher Funktionen,” Mathematische
Annalen, vol. 68 (1910), pp. 222-269.

[4] A. Wintner, “The adiabatic linear oscillator,” American Journal of Mathematics,
vol. 68 (1946), pp. 385-397.

[5] A. Wintner, “(L²)-connections between the potential and kinetic energies of linear
systems,” ibid., vol. 69 (1947), pp. 5-13.

[6] A. Wintner, “Asymptotic integrations of the adiabatic oscillator,” ibid., vol. 69 (1947),
pp. 251-272.

[7] A. Wintner, “On the location of continuous spectra,” ibid., vol. 70 (1948), pp. 22-30.




THE CLUSTER SPECTRA OF BOUNDED POTENTIALS.* 


By C. R. Putnam. 


1. Let $q = q(x)$ be a real continuous function on the half-line $0 \le x < \infty$ and let $\lambda$ denote a real parameter. The differential equation

(1) $y'' + (\lambda - q)y = 0$

is said to be in the Grenzpunktfall, in the terminology of Weyl [9], p. 238, if for some $\lambda$ (hence, for all $\lambda$) (1) possesses a solution $y$ satisfying

$\int_0^\infty y^2\,dx = \infty$.

Here, and in the sequel, only real functions will be considered.

In the Grenzpunktfall, the differential equation (1) and a homogeneous boundary condition

(2) $y(0)\cos\alpha + y'(0)\sin\alpha = 0$,

determine a boundary value problem for every fixed $\alpha$. It is known [9], p. 251, that the set, $S'$, of the cluster points of the spectrum of any such boundary value problem is independent of $\alpha$. Denote the least point (possibly $+\infty$) of $S'$ by $\lambda_0$. It is known [5], p. 310, that the set of points constituting the spectrum of any fixed boundary value problem, determined by (1) and (2), cannot be bounded from above. Also, the set of points of the continuous spectrum either is empty or is unbounded; [12], p. 782.

If $q$ is bounded,

(3) $|q| < \text{const.}$, $0 \le x < \infty$,

and even if $q$ is just bounded from below,

(3 bis) $q > \text{const.}$,

the differential equation (1) is in the Grenzpunktfall; [9], p. 238. According to the corollary in [3], p. 850, the condition (3) implies that $\lambda_0$ satisfies the inequality $A \le \lambda_0 \le B$, where $A$ and $B$ are defined by


* Received December 11, 1948. 




(4) $A = \liminf_{x \to \infty} q(x)$, $B = \limsup_{x \to \infty} q(x)$.

The theorem to be proved in the present paper yields an extension of these results to information concerning the set $S'$. It will be shown that if $q$ satisfies (3), then a “gap” in $S'$ (that is, a $\lambda$-interval occurring in the decomposition of the (possibly empty) open set constituting the complement of $S'$ with respect to the half-line $\lambda_0 \le \lambda < \infty$) cannot have a length greater than $B - A$. In particular, condition (3) implies that the set $S'$ is unbounded. The milder restriction (3 bis) surely is not sufficient to guarantee this result; in fact, according to [9], p. 252, if $q \to \infty$ as $x \to \infty$, the set $S'$ is empty.

All of this is contained in the following theorem:

(*) If $q = q(x)$ is continuous on $0 \le x < \infty$ and satisfies (3), then every $\lambda$-interval, $\mu_1 < \lambda < \mu_2$, contains at least one point of $S'$ whenever the length of the interval satisfies

(5) $\mu_2 - \mu_1 > \limsup q(x) - \liminf q(x)$

and the lower end point $\mu_1$ satisfies

$\mu_1 \ge \liminf q(x)$.

As a consequence of (*), one obtains the theorem [2], p. 71, that if $q(x) \to 0$ as $x \to \infty$ then every point of the half-line $0 \le \lambda < \infty$ belongs to $S'$.
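
The hypotheses of (*) are easy to test for a concrete interval once $A$ and $B$ are known. The sketch below is only an illustration under stated assumptions: `tail_bounds` estimates $A$ and $B$ crudely from finitely many samples of $q$ far out on the half-line (which is a heuristic, not a computation of lim inf and lim sup), `theorem_applies` merely encodes condition (5) together with $\mu_1 \ge A$, and the potential `sample_potential` is a hypothetical example.

```python
import math


def tail_bounds(q, x_start, x_end, n=100_000):
    """Heuristic estimates of (lim inf q, lim sup q): the minimum and maximum
    of q on a finite grid over [x_start, x_end]."""
    samples = [q(x_start + (x_end - x_start) * i / n) for i in range(n + 1)]
    return min(samples), max(samples)


def theorem_applies(mu1, mu2, A, B):
    """True when the interval (mu1, mu2) satisfies the hypotheses of (*):
    length greater than B - A, and lower end point not below A."""
    return (mu2 - mu1 > B - A) and (mu1 >= A)


def sample_potential(x):
    """Hypothetical bounded potential with lim inf = -1 and lim sup = 1."""
    return math.cos(x) + 1.0 / (1.0 + x)


A_est, B_est = tail_bounds(sample_potential, 1000.0, 1100.0)
```

With $A = -1$ and $B = 1$, the interval $-1 < \lambda < 1.5$ satisfies the hypotheses (`theorem_applies(-1.0, 1.5, -1.0, 1.0)` is true), so by (*) it must contain a point of $S'$; for a shorter interval such as $0 < \lambda < 1$ the theorem makes no assertion.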

It is known [9], p. 251, that a point $\lambda$ is not in the spectrum of the boundary value problem determined by (1) and (2), for a fixed $\alpha$, if and only if for every continuous function $f(x)$ of class $(L^2)$ on $0 \le x < \infty$ the inhomogeneous differential equation

(6) $y'' + (\lambda - q)y = f$

possesses a unique solution of class $(L^2)$ satisfying the boundary condition (2). The proof of (*) will depend upon this characterization of the spectrum together with an adaptation of the principle (which can be regarded as a manifestation of the Lebesgue-Toeplitz norm construction) used in [11], pp. 26-27, and then applied in [4], [8] and [2], for the location of points of the spectrum.


2. Proof of (*). Let $\phi = \phi(x, \lambda, \alpha)$ denote that solution of (1) satisfying (2) for which


$\phi(0, \lambda, \alpha) = -\sin\alpha$ and $\phi'(0, \lambda, \alpha) = \cos\alpha$,

where the prime denotes partial differentiation with respect to $x$. Let $\lambda_1, \lambda_2, \cdots$ denote the (possibly empty) set of points of the point spectrum of the boundary value problem determined by (1) and (2). If $\phi_j$ denotes the normalized eigenfunction belonging to the eigenvalue $\lambda_j$, then

(7) $L(\phi_j) + \lambda_j\phi_j = 0$,

where $L(y)$ denotes the differential operator

(8) $L(y) = y'' - qy$.

According to Weyl’s application [9], pp. 239-251, of Hellinger’s spectral resolution theory [6] to differential equations, there exists a continuous monotone function $\rho = \rho(\mu)$ such that the function $P = P(x, \lambda, \alpha)$,


is of class (Z*) and satisfies 
(9) L(P(2,d2)) +AP(2,A,2) 


In terms of the eigendifferentials $d_\lambda P = dP(x, \lambda, \alpha)$ the differential equation (9) may be expressed as

(9 bis) $L(dP) + \lambda\,dP = 0$.

The eigenfunctions $\phi_j$ together with the eigendifferentials $dP$ constitute an orthonormal complete set of functions on $0 \le x < \infty$; hence, if $y$ is any function of class $(L^2)$ on $0 \le x < \infty$, with “Fourier coefficients”

(10) $c_j = \int_0^\infty y\,\phi_j\,dx$, $\Delta\Gamma(\lambda) = \int_0^\infty y(x)\,\Delta P(x, \lambda, \alpha)\,dx$,

the Parseval relation

(11) $\int_0^\infty y^2\,dx = \sum_j c_j^2 + \int (d\Gamma)^2/d\rho$

holds.

Since the addition of a constant $c$ to $q(x)$ merely translates the spectrum, and hence $S'$, by $c$, it can be supposed, without loss of generality, that

(12) $0 \le B = -A < \infty$




(cf. (4)). The proof of (*) will be complete if it is shown that the $\lambda$-interval $\mu_1 < \lambda < \mu_2$, where $\mu_1 \ge -B$, contains at least one point of $S'$ whenever

$\mu_2 - \mu_1 > 2B$.

Suppose, if possible, that the interval $\mu_1 < \lambda < \mu_2$ contains no points of $S'$. Choose an arbitrary positive number $\varepsilon$ so small that $\mu_1 + 2\varepsilon < \mu_2 - 2\varepsilon$ and that

(13) $\lambda^* = (\mu_1 + \mu_2)/2$

is not an eigenvalue. It is clear that $\lambda^* > 0$. Denote by $\lambda^1, \lambda^2, \cdots, \lambda^M$, where $M = M_\varepsilon$, the eigenvalues (if any) satisfying

(14) $\mu_1 + \varepsilon \le \lambda^j \le \mu_2 - \varepsilon$,

and denote by $\phi^1, \phi^2, \cdots, \phi^M$ the corresponding normalized eigenfunctions.


In the sequel, the symbol $f$ will always denote a continuous function, of class $(L^2)$ on $0 \le x < \infty$ satisfying


(15) f 
and ° 
(16) <e 


Any statement involving “independent of $f$” will mean independent of the choice of $f$ in the class of continuous functions $f$ for which (15) and (16) hold. Consider the inhomogeneous differential equation (6) for $\lambda = \lambda^*$, that is, by (8),

(17) $L(y) + \lambda^* y = f$.

Since $\lambda^*$ does not belong to the spectrum of the boundary value problem determined by (1) and (2), there exists a unique solution $y$ of class $(L^2)$ satisfying (2), cf. [9], p. 251. It follows from the Green identity (cf. [9], pp. 241-242) that


Moreover, one obtains from (7) and the first relation of (10) 


yL (pj) dx = — Aje; 




and from (9) or (9bis) and the second relation of (10) 


f = — 
7X 


(Cf. [7], pp. 782-785; there, however, the contributions of both the point spectrum and the continuous spectrum are included in the function $P(x, \lambda, \alpha)$.) It follows from (17) and the last four formula lines that


(18) S (A¥ — Aj) 
and 
(19) f fAPdr = (A* — pw) dT (p). 
“xr 


If » is defined by 
(20) — pr) /2 — 2e, 


then relations (18), (19) and the Parseval relation for f yield the inequality 


where $\Delta\Gamma$ and $c_j$ are defined by (10) and where the asterisk indicates that the integration and summation are to be taken over that part of the spectrum exterior to the $\lambda$-interval $\mu_1 + \varepsilon \le \lambda \le \mu_2 - \varepsilon$. Since the only contribution of this interval to the spectrum is that due to the points $\lambda^1, \lambda^2, \cdots, \lambda^M$ (cf. (14)), relations (11), (16), (18) and (21) imply that


f Pac f —e). 
0 0 


Thus, in virtue of (15) and (20) 
(22) rf yde <C, 
0 


where the constant $C_\varepsilon$ depends on $\varepsilon$ but is independent of $f$ (and $y$). The inequality (22) will play a fundamental role in the calculations to follow.

Henceforth, it will be supposed that the boundary value problem under consideration is that determined by the differential equation (1) and the boundary condition (2) for $\alpha = 0$, that is, $y(0) = 0$. This is no loss of generality since $S'$ is independent of the choice of $\alpha$ in (2). Let the function
y=y(zr) on < be defined by 


(23) y = sin 




so that y is real since A*¥ > 0. The next step in the proof of (*) will be to 
show that there exists a number NV = N,, depending only on e and A* and 
not on f or y, having the property that on any z-interval of length N there 
exists at least one point € such that 


(24) | —9 (E)| 


In virtue of (17), 


x x x 


while an integration by parts yields 


x 
—f | + + qy*) dz. 


Since the functions y and f belong to class (L?) and q satisfies condition (3), 
it follows from the proof of the theorem [10], p. 6 (cf. also [1]), that 
y(X)y’(X) 30 as X— oo. Since y(0) —0, one obtains from the last two 
formula lines that 


(25) f + gy’) dx = r* J yede— f fyde. 
70 0 0 


An application of the Schwarz inequality to the second integral on the right 
of the last inequality yields, in virtue of (15), 


Relations (25), (3) and (22) imply 


ve 0 


(26) J < 
0 


where the constant D, is again independent of f. The existence of the number 
N is clearly a consequence of (22), (23) and (26). 

It will be shown that in virtue of the above results it is possible to define 
a continuous function f(x) on OS a< o, satisfying (15) and (16), and 
to select a pair of positive constants €, < & in such a way that the function y 
(which depends on the choice of f) satisfies the inequality 


£5 
(27) (B+o( f 




Suppose this has been proved. Then, by the first inequality of (22), 
(B +e) (1 + = (1— —2e. 

Since $\varepsilon$ can be chosen arbitrarily small, the last inequality and (20) imply $2B/(\mu_2 - \mu_1) \ge 1$, that is,
(28) $\mu_2 - \mu_1 \le 2B$.
Since (28) is in contradiction with the assumption $\mu_2 - \mu_1 > 2B$, the supposition that the $\lambda$-interval $\mu_1 < \lambda < \mu_2$ contains no points of $S'$ is untenable.

Thus, in order to complete the proof of (*), it remains only to establish 


relation (27) for suitably chosen f and €,, € In virtue of (12) and (4) 
there exists a positive number 7’ = T, so large that 


b 

(29) ( f ¢vae/ y?dz)§ = B+ whenever.b >a=T. 
In addition, T can be chosen so that 
(30) y(T) =0 
and 

M 

j=1 


On the other hand, (23) shows that f y?da = «. Hence, if T is fixed, 
0 


U-N U 
(32) y*dz/ > 1—e 
T+N 


T 


whenever U =U, (> 7+ 2N) is sufficiently large. Let U (>T+2N) 
be chosen so as to satisfy (32), y(U) =0 and 

U-N 
(33) = 1. 

T+N 


Define the function f(x) on OS a < by 


0 if ow. 


It follows from (30) and the definition of U that f is continuous on 
0<2z< and satisfies (15). That f also satisfies (16) is a consequence 
of (31) and (15). Finally, the property of the number N shows that there 
exist numbers €, and &€, satisfying 




and (24) for €=é, and é=&,. It is clear from (32), the definition of U 
and (34) that 


| (35) J, vide > 
Also, (33) and (34) imply 
(36) f Wade 
& 
Since y satisfies the differential equation 
+ = 0, 
it follows from (17) that 
— = + fy 


and, consequently, 
f fo 
& & & 
Hence, in virtue of (24) for =, and é=6,, 
& & 
but, from (35) and the definition of f, it follows that 
* & 
& & 
The last two relations and (36) show that 
& 
Finally, the Schwarz inequality and relations (29) and (34) give 
bs 
& & & 
The last two formula lines imply (27), and so the proof of (*) is complete. 


THE JOHNS HOPKINS UNIVERSITY.




REFERENCES. 


[1] P. Hartman, “The L²-solutions of linear differential equations of second order,”
Duke Mathematical Journal, vol. 14 (1947), pp. 323-326.

[2] P. Hartman, “On the spectra of slightly disturbed linear oscillators,” American Journal
of Mathematics, vol. 71 (1949), pp. 71-79.

[3] P. Hartman and C. R. Putnam, “The least cluster point of the spectrum of boundary
problems,” ibid., vol. 70 (1948), pp. 849-855.

[4] P. Hartman and A. Wintner, “An oscillation theorem for continuous spectra,” Proceedings
of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[5] P. Hartman and A. Wintner, “On the orientation of unilateral spectra,” American
Journal of Mathematics, vol. 70 (1948), pp. 309-316.

[6] E. Hellinger, “Neue Begründung der Theorie quadratischer Formen von unendlichvielen
Veränderlichen,” Journal für die reine und angewandte Mathematik,
vol. 136 (1909), pp. 210-271.

[7] C. R. Putnam, “An application of spectral theory to a singular calculus of variations
problem,” American Journal of Mathematics, vol. 70 (1948), pp.
780-803.

[8] C. R. Putnam, “On the spectra of certain boundary value problems,” ibid., vol. 71 (1949),
pp. 109-111.

[9] H. Weyl, “Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die
zugehörigen Entwicklungen willkürlicher Funktionen,” Mathematische
Annalen, vol. 68 (1910), pp. 220-269.

[10] A. Wintner, “(L²)-connections between the potential and kinetic energies of
linear systems,” American Journal of Mathematics, vol. 69 (1947), pp. 5-13.

[11] A. Wintner, “On the location of continuous spectra,” ibid., vol. 70 (1948), pp. 22-30.

[12] A. Wintner, “On Dirac’s theory of continuous spectra,” Physical Review, vol. 73
(1948), pp. 781-785.




ON THE NON-VANISHING AT s=1 OF CERTAIN DIRICHLET 
SERIES.* 


By GEORGE SHAPIRO.


By a multiplicative function is meant a function, $f(n)$, which is defined on the set of positive integers, is not identically zero, and satisfies $f(nm) = f(n)f(m)$ if $(n, m) = 1$. If the last proviso is unnecessary, that is, if $f(nm) = f(n)f(m)$ for all $n$ and $m$, then $f(n)$ is called a completely multiplicative function.

The restriction that $f(n)$ is not identically zero is equivalent to $f(1) = 1$. Therefore any completely multiplicative function may be obtained by choosing $f(p)$ for each prime, $p$, and by setting

(1) $f(n) = f(p_1)^{l_1} \cdots f(p_r)^{l_r}$ if $n = p_1^{l_1} \cdots p_r^{l_r}$, and $f(1) = 1$.

Any multiplicative function may be obtained by choosing $f(p^l)$ for each prime, $p$, and each positive integer, $l$, and setting

(2) $f(n) = f(p_1^{l_1}) \cdots f(p_r^{l_r})$ if $n = p_1^{l_1} \cdots p_r^{l_r}$, and $f(1) = 1$.

Formally, (1) and (2) are equivalent to the Euler relations

(3) $\sum f(n)/n^s = \prod_p (1 - f(p)/p^s)^{-1}$

and

(4) $\sum f(n)/n^s = \prod_p (1 + f(p)/p^s + f(p^2)/p^{2s} + \cdots)$

respectively, where $\sum = \sum_{n=1}^{\infty}$ in (3) and (4), and in the sequel. The statement that $f(n)$ is a bounded completely multiplicative function is equivalent to $|f(n)| \le 1$. In this case, (3) is valid for $\Re s > 1$ by virtue of absolute convergence.
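
The constructions (1) and (2) can be sketched in a few lines. The code below is an illustration only; the helper names `factorize`, `completely_multiplicative` and `multiplicative`, and the two sample functions, are hypothetical and do not come from the paper.

```python
def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def completely_multiplicative(value_at_prime):
    """Build f from the values f(p), as in (1); the empty product gives f(1) = 1."""
    def f(n):
        result = 1
        for p, l in factorize(n).items():
            result *= value_at_prime(p) ** l
        return result
    return f


def multiplicative(value_at_prime_power):
    """Build f from the values f(p**l), as in (2); again f(1) = 1."""
    def f(n):
        result = 1
        for p, l in factorize(n).items():
            result *= value_at_prime_power(p, l)
        return result
    return f


# Sample choice f(p) = -1 for every prime: Liouville's function.
liouville = completely_multiplicative(lambda p: -1)
# Sample choice f(p**l) = l + 1: the number of divisors d(n).
divisor_count = multiplicative(lambda p, l: l + 1)
```

The first sample satisfies $f(nm) = f(n)f(m)$ for all $n$, $m$; the second does so only when $(n, m) = 1$, for instance $d(24) = 8$ while $d(4)\,d(6) = 12$.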

If $f(n)$ is a residue character mod $m$, then the non-vanishing of $L(s) = \sum f(n)/n^s$ at $s = 1$ is of prime importance in the proof of Dirichlet’s theorem that infinitely many primes occur in the sequence $h, h + m, h + 2m, \cdots$, where $(h, m) = 1$. In this case, however, the absolute value of $f(n)$ is either 0 or 1.
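
For a concrete residue character the non-vanishing of $L(1)$ can be observed numerically. The sketch below, which is not part of the paper, uses the non-principal character mod 4, for which $L(1) = 1 - 1/3 + 1/5 - \cdots = \pi/4$ (Leibniz's series); the names `chi4` and `partial_L` and the cut-off are hypothetical choices.

```python
import math


def chi4(n):
    """The non-principal residue character mod 4."""
    r = n % 4
    if r == 1:
        return 1
    if r == 3:
        return -1
    return 0


def partial_L(chi, s, N):
    """Partial sum of the Dirichlet series sum chi(n)/n**s over 1 <= n <= N."""
    return sum(chi(n) / n ** s for n in range(1, N + 1))


# Partial sum of L(1) for the character mod 4; it approaches pi/4, which is not 0.
approx_L1 = partial_L(chi4, 1.0, 200_000)
```

Since the series is alternating with decreasing terms, the error of the partial sum is at most the first omitted term, so the computed value is within about $10^{-5}$ of $\pi/4$.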


* Received December 1, 1948. 


Ingham [1] gave a simple direct proof that $L(1) \ne 0$ if $f(n)$ is any residue character; previously the case in which $f(n)$ assumes only the values $0$, $+1$, $-1$, required separate consideration. Wintner [3] extended Ingham’s method to bounded, real, completely multiplicative functions and, in fact, to those satisfying $-1 \le f(p)$ for all $p$. In this paper, a combination of the methods of Ingham [1] and Wintner [3] is used to generalize Wintner’s result to complex functions. In Theorem I below, Wintner’s result for real bounded functions is extended to complex bounded functions. In Theorem II the boundedness condition is weakened in certain directions; however, Wintner’s result is not obtained as a corollary.


THEOREM I. If $f(n)$ is a bounded completely multiplicative function, then $L(s) = \sum f(n)/n^s$ is absolutely convergent for $\sigma = \Re(s) > 1$. If $L(s)$ possesses a regular analytic continuation onto a domain containing the real segment $\tfrac12 \le \sigma \le 1$, then $L(1) \ne 0$.


Proof. Let 

$E(n) = \sum_{d \mid n} f(d)$. 

Then $\sum |E(n)|^2/n^s$ is absolutely convergent for $\sigma > 1$ since $|E(n)| \le d(n)$, 
the number of divisors of n. Let $J(s) = \sum |f(n)|^2/n^s$, so that the 
corresponding Euler relation (3) implies 

$J(s) = \prod_p (1 - |f(p)|^2/p^s)^{-1}$ if $\sigma > 1$. 

It follows that if $\sigma > 1$, then 

(5) $\sum |E(n)|^2/n^s = \zeta(s) L(s) \bar{L}(s) J(s)/J(2s)$; 

cf., e.g., [1]. 
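Relation (5) can be checked one Euler factor at a time. The sketch below assumes the notation a = f(p) and x = p^{-s} (both introduced here for the example), uses $E(p^k) = 1 + a + \cdots + a^k$, and compares the power series in x with the closed form that the right side of (5) contributes at the prime p.

```python
def local_series(a, x, terms=200):
    """Sum over k of |E(p^k)|^2 x^k, where E(p^k) = 1 + a + ... + a^k."""
    total = 0.0
    partial_e = 0j       # running value of E(p^k)
    power_a = 1 + 0j     # a^k
    power_x = 1.0        # x^k
    for _ in range(terms):
        partial_e += power_a
        total += abs(partial_e) ** 2 * power_x
        power_a *= a
        power_x *= x
    return total


def local_closed_form(a, x):
    """Euler factor of zeta(s) L(s) conj-L(s) J(s) / J(2s) at one prime."""
    m = abs(a) ** 2
    numerator = 1 - m * x * x
    denominator = (1 - x) * (1 - a * x) * (1 - a.conjugate() * x) * (1 - m * x)
    return (numerator / denominator).real


# Sample values of a = f(p) with |a| <= 1 (illustrative choices).
samples = [complex(0.6, 0.8), complex(-1.0, 0.0), complex(0.0, 0.5), complex(1.0, 0.0)]
differences = [abs(local_series(a, 0.25) - local_closed_form(a, 0.25)) for a in samples]
```

For a = 1 both expressions reduce to $(1 + x)/(1 - x)^3$, the generating function of $(k + 1)^2$.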
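If H(n) is defined by 

$H(n) = 1$ or $H(n) = \prod_{p \mid n} (1 - |f(p)|^2)$ 

according as n = 1 or n > 1, then for a fixed prime p, 

$(1 - |f(p)|^2/p^s)(1 - 1/p^s)^{-1} = 1 + \sum_{k=1}^{\infty} (1 - |f(p)|^2)/p^{ks} = \sum_{k=0}^{\infty} H(p^k)/p^{ks}$. 

Clearly, H(n) is multiplicative and satisfies $0 \le H(n) \le 1$; hence, for $\sigma > 1$, 

$\sum H(n)/n^s = \prod_p (1 - |f(p)|^2/p^s)(1 - 1/p^s)^{-1} = \zeta(s)/J(s)$. 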




THE NON-VANISHING OF CERTAIN DIRICHLET SERIES. 623 


Multiplying the last relation by (5), we obtain 

(6) $\zeta^2(s) L(s) \bar{L}(s)/J(2s) = (\sum H(n)/n^s)(\sum |E(n)|^2/n^s)$ for $\sigma > 1$. 

The two series on the right are absolutely convergent and so their product 
may be represented by a Dirichlet series, $\sum G(n)/n^s$, which also converges 
absolutely for $\sigma > 1$. Clearly, G(1) = 1 and $G(n) \ge 0$. 

Now J(s) is regular and does not vanish for $\sigma > 1$; therefore the same 
is true of J(2s) when $\sigma > 1/2$. The function $\zeta^2(s)$ is regular for $\sigma > 1/2$ 
except for a double pole at s = 1. If we assume that L(1) = 0, then 
$\bar{L}(1) = 0$ and the pole of $\zeta^2(s)$ at s = 1 is absorbed in the product on the 
left of (6). Consequently, the function on the left of (6) is regular for 
$\sigma > 1/2$, and by Landau's extension [2] of the Vivanti-Pringsheim theorem, 
the series $\sum G(n)/n^s$ converges absolutely and represents the function on 
the left of (6) for $\sigma > 1/2$. 

Now two cases arise, 

(i) $\sum |f(n)|^2/n < \infty$ 
or 
(ii) $\sum |f(n)|^2/n = \infty$. 

If (i) is true, then $\sum |f(p)|^2/p < \infty$ and $\sum' 1/p < \infty$, where $'$ indicates 
summation over those primes for which $|f(p)| \ge 1/2$. Also (6) can be 
written as 

(7) $\zeta^2(s) L(s) \bar{L}(s) = J(2s) \sum G(n)/n^s$, 

where the two Dirichlet series on the right are absolutely convergent for 
$\sigma > 1/2$, so that their product may be represented there by an absolutely 
convergent Dirichlet series, say $\sum e(n)/n^s$. Thus 

$\sum e(n)/n^s = \zeta^2(s) (\sum f(n)/n^s)(\sum \bar{f}(n)/n^s)$, 

where 

$e(n) = \sum_{k \mid n} d(n/k) (\sum_{m \mid k} f(m) \bar{f}(k/m))$; 

in particular, $e(p) = 2 + 2\Re f(p)$. Now $\sum |e(p)|/p < \infty$; and if $|f(p)| < 1/2$, 
then $|e(p)| \ge 1$, so that $\sum'' 1/p < \infty$, where the $''$ indicates summation over 
those primes for which $|f(p)| < 1/2$. Hence, if (i) holds, we obtain the 
contradictory result $\sum 1/p < \infty$, the sum being extended over all primes. 

The case (ii) will now be considered. Since G(1) = 1 and $G(n) \ge 0$, 
the relation (7) implies 

$|\zeta(\sigma) L(\sigma)|^2 \ge |J(2\sigma)|$ for $\sigma > 1/2$. 




But this again leads to a contradiction since $|\zeta(\sigma) L(\sigma)|^2$ has a finite limit 
as $\sigma$ tends to 1/2 + 0, whereas (ii) implies that $|J(2\sigma)| \to \infty$ as $\sigma \to 1/2 + 0$. 

Consequently, the assumption L(1) = 0 is untenable and the proof of 
Theorem I is complete. 

A somewhat different treatment is required in order to extend the result 
to unbounded completely multiplicative functions. We first prove two lemmas 
concerning the non-vanishing of the Dirichlet series with coefficients $\mu^2(n) f(n)$, 
where $\mu(n)$ is the Möbius function. 


LEMMA A. Let g(n) be a multiplicative function with the properties 
that 
(8) $g(p^k) = 0$ if k > 1 
and 
(9) $\Re g(p) \ge -1/2$ for all primes p. 

If $G(s) = \sum g(n)/n^s$ converges absolutely on some half plane $\sigma > \sigma_0$ (> 1) 
and possesses a regular analytic continuation onto a domain containing the 
real segment $1 \le \sigma \le \sigma_0$, then $G(1) \ne 0$. 

Proof. First, $G(s) \bar{G}(s) \zeta(s) = \sum N(n)/n^s$ for $\sigma > \sigma_0$, where N(n) is 
the multiplicative function, 

$N(n) = \sum_{k \mid n} g(n/k) \sum_{m \mid k} \bar{g}(m)$; 

so that $N(p) = 1 + 2\Re g(p)$ and 

$N(p^k) = 1 + 2\Re g(p) + |g(p)|^2 = (1 + g(p))(1 + \bar{g}(p)) \ge 0$ if k > 1. 

Since (9) implies $N(p) \ge 0$ as well, $N(n) \ge 0$ for all n. 

Now if G(1) = 0, then $G(s) \bar{G}(s) \zeta(s)$ is regular for $\sigma \ge 1$ and vanishes 
at $\sigma = 1$. By Landau's extension [2] of the Vivanti-Pringsheim theorem, 
the series $\sum N(n)/n^s$ converges and represents this function in this region. 
In particular, $\sum N(n)/n = 0$, which is clearly impossible. 
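The sign argument in this proof can be illustrated numerically. The sketch below builds a sample g that vanishes off the square-free integers, forms the Dirichlet convolution of g, its conjugate and the constant 1, and inspects the resulting coefficients. The sample value g(p) = -1/2 + i for every prime and the cut-off 200 are assumptions made for the example.

```python
def g_value(n, value_at_prime):
    """Multiplicative g with g(p) prescribed and g(p^k) = 0 for k > 1."""
    result, p = 1 + 0j, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0j          # p^2 divides the argument
            result *= value_at_prime(p)
        p += 1
    if n > 1:
        result *= value_at_prime(n)
    return result


def dirichlet_convolve(a, b, limit):
    """Coefficients of the product of two Dirichlet series, up to index limit."""
    c = [0j] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            c[d * m] += a[d] * b[m]
    return c


LIMIT = 200
sample_value = lambda p: complex(-0.5, 1.0)   # real part is exactly -1/2

g = [0j] + [g_value(n, sample_value) for n in range(1, LIMIT + 1)]
g_bar = [value.conjugate() for value in g]
ones = [0j] + [1 + 0j] * LIMIT

# N(n): coefficients of G(s) * conj-G(s) * zeta(s).
N = dirichlet_convolve(dirichlet_convolve(g, g_bar, LIMIT), ones, LIMIT)
```

With this sample, N(p) = 1 + 2(-1/2) = 0 and N(p^2) = |1 + g(p)|^2 = 1.25, so the boundary case of (9) is visible directly in the coefficients.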


LEMMA B. Let g(n) be a multiplicative function with the properties that 

(8) $g(p^k) = 0$ if k > 1 
and 

(10) $\Re g(p) \ge -1$ and $2 + 4\Re g(p) + |g(p)|^2 \ge 0$ for all primes p. 

If $G(s) = \sum g(n)/n^s$ converges absolutely on some half plane $\sigma > \sigma_0$ (> 1/2) 
and possesses a regular analytic continuation onto a domain containing the 
real segment $1/2 \le \sigma \le \sigma_0$, then $G(1) \ne 0$. 


Proof. First, $\zeta^2(s)/\zeta(2s) = \sum 2^{\nu(n)}/n^s$ for $\sigma > 1$, where $\nu(n)$ is the 
number of distinct prime divisors of n. Hence 

(11) $G(s) \bar{G}(s) \zeta^2(s)/\zeta(2s) = \sum N(n)/n^s$ for $\sigma > \sigma_0$, 

where N(n) is the multiplicative function determined by 

$N(p) = 2 + 2\Re g(p)$, $N(p^2) = 2 + 4\Re g(p) + |g(p)|^2$ 

and, if k > 2, 

$N(p^k) = 2 + 4\Re g(p) + 2|g(p)|^2 \ge 0$. 

The condition (10) assures that $N(n) \ge 0$ for all n. 

If G(1) = 0, the function on the left of (11) is regular on the real half- 
line $\sigma > 1/2$; by the theorem referred to before in [2], the series on the right 
converges and represents this function there. Then 

$|G(\sigma) \zeta(\sigma)|^2 = |\zeta(2\sigma)| \sum N(n)/n^{\sigma} \ge |\zeta(2\sigma)|$ 

for $\sigma > 1/2$. But this is impossible, for the left side is bounded while the 
right side tends to infinity as $\sigma$ tends to 1/2 + 0. 
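The identity used at the start of this proof can be checked numerically. The sketch below evaluates a partial sum of $\sum 2^{\nu(n)}/n^s$ at s = 2, where $\zeta^2(2)/\zeta(4) = (\pi^4/36)/(\pi^4/90) = 5/2$. The point s = 2 and the cut-off 20000 are assumptions made for the example.

```python
def distinct_prime_counts(limit):
    """nu[n] = number of distinct prime divisors of n, for 0 <= n <= limit."""
    nu = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if nu[p] == 0:                      # no smaller prime divides p, so p is prime
            for multiple in range(p, limit + 1, p):
                nu[multiple] += 1
    return nu


LIMIT = 20000
nu = distinct_prime_counts(LIMIT)

# Partial sum of the Dirichlet series with coefficients 2^nu(n), at s = 2.
partial_sum = sum(2 ** nu[n] / n ** 2 for n in range(1, LIMIT + 1))
expected = 2.5   # zeta(2)^2 / zeta(4)
```

Since all terms are positive, the partial sum approaches 5/2 from below.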

An application of either of these lemmas allows the boundedness assump- 
tion on f(m) in Theorem I to be weakened in certain directions; and in the 
case of an application of Lemma A, a relaxation of the assumption concerning 
the analytic continuability of L(s) is also possible. 


THEOREM II. Let f(n) be a completely multiplicative function with 
the properties that $\sum f(n)/n^s$ converges absolutely on some half-plane $\sigma > \sigma_0$ 
(> 1), that $\sum f(n^2)/n^2$ converges and is not 0, and that $g(n) = \mu^2(n) f(n)$ 
satisfies either (9) or (10) (as well as (8)). Then $G(s) = \sum g(n)/n^s$ 
converges absolutely for $\sigma > \sigma_0$. Suppose, further, that G(s) satisfies the 
assumption of analyticity made in either Lemma A or B, according as g(n) 
satisfies (9) or (10). Then L(s) possesses a regular analytic continuation 
onto a domain containing the real segment $1 < \sigma \le \sigma_0$, and L(1 + 0) exists 
and is not zero. 


Proof. Let r(n) be f(n) or 0 according as n is or is not a square. 
Then r(n) is multiplicative and the series $\sum r(n)/n^s$ converges absolutely 
for $\sigma > \sigma_0$. Furthermore, 

$f(n) = \sum_{d \mid n} g(d) r(n/d)$, 

since 

$g(1) r(p^k) + g(p) r(p^{k-1}) = f(p^k)$ $(k = 1, 2, \ldots)$. 

Thus $L(s) = G(s) \sum f(n^2)/n^{2s}$ for $\sigma > \sigma_0$; so that L(s) can be continued 
analytically onto a domain containing the real segment $1 < \sigma \le \sigma_0$, while 
L(1 + 0) exists and 

$L(1 + 0) = G(1) \sum f(n^2)/n^2$. 

Hence $L(1 + 0) \ne 0$, since $G(1) \ne 0$ by Lemma A or B, while the second 
factor in the last formula line does not vanish by assumption. 
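The factorization of f used in this proof can be verified directly on integers. In the sketch below, the sample completely multiplicative function (f(p) = 2 if p is congruent to 1 mod 4, and f(p) = -1 otherwise) is an assumption chosen only so that all arithmetic is exact; g and r are formed from it as in the proof.

```python
from math import isqrt


def factorize(n):
    """Return the prime factorization of n as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def f(n):
    """Sample completely multiplicative function (illustrative choice)."""
    value = 1
    for p, k in factorize(n).items():
        value *= (2 if p % 4 == 1 else -1) ** k
    return value


def g(n):
    """g(n) = mu^2(n) f(n): equals f(n) on square-free n and 0 elsewhere."""
    if any(k > 1 for k in factorize(n).values()):
        return 0
    return f(n)


def r(n):
    """r(n) = f(n) if n is a perfect square, and 0 otherwise."""
    root = isqrt(n)
    return f(n) if root * root == n else 0


def convolution(n):
    """Sum of g(d) r(n/d) over the divisors d of n."""
    return sum(g(d) * r(n // d) for d in range(1, n + 1) if n % d == 0)
```

Every n is uniquely a square-free number times a square, so exactly one term of the divisor sum survives, and it equals f(n).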


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] A. E. Ingham, "Note on Riemann's zeta function and Dirichlet's L-functions," 
Journal of the London Mathematical Society, vol. 5 (1930), pp. 107-112. 

[2] E. Landau, "Über einen Satz von Tschebyschef," Mathematische Annalen, vol. 61 
(1905), pp. 527-550. 

[3] A. Wintner, "The fundamental lemma in Dirichlet's theory of the arithmetical 
progressions," American Journal of Mathematics, vol. 68 (1946), pp. 385-392. 




OSCILLATORY AND NON-OSCILLATORY LINEAR 
DIFFERENTIAL EQUATIONS.* 


By PHILIP HARTMAN and AUREL WINTNER. 


In the differential equation 
(1) $x'' + f(t)x = 0$, 
let f(t) be real-valued and continuous for large positive t, say for 
(2) $0 \le t < \infty$, 

and consider only those solutions x(t) of (1) which are real-valued and do 
not vanish identically. These obvious restrictions of the coefficient function, 
f(t), and of the admitted solutions, x(t), of (1) will not be repeated in the 
wording or the proof of the results below. 

The object will be to delimit, on the one hand, general possibilities for 
the behavior either of some or of all (i.e., either of one or of two linearly 
independent) solutions of (1) as $t \to \infty$ and to obtain, on the other hand, 
corresponding results for spectra (in Hilbert's sense). The latter aspect 
results if f(t) in (1) is replaced by $f(t) + \lambda$, where $\lambda$ is a real parameter, 
and an homogeneous linear boundary condition is assigned at t = 0. 

As the various possibilities lead into quite different directions, each of 
the chapters will have its own introduction, explaining the problem at hand. 


The Oscillatory Case. 


If f(t) is a positive constant, then every solution x(t) of (1), being a 
sine curve, is O(1) as $t \to \infty$. On the other hand, if f(t) = const. > 0 is 
relaxed to the assumption that 

(3) $\lim_{t \to \infty} f(t)$ exists and is positive ($\ne \infty$), 

then, as shown by Perron [14], it is possible for (1) to have a solution for 
which 
(4) $\limsup_{t \to \infty} |x(t)| = \infty$. 


* Received October 2, 1948. 


628 PHILIP HARTMAN AND AUREL WINTNER. 


From his counter-example, it is not possible to answer either of the following 
questions: If (3) is assumed, can (4) hold ($\alpha$) for some but not for every, 
($\beta$) for every, solution x(t) of (1)? The first part of (i) below answers both 
of these questions in the affirmative. The second part of (i) represents an 
unexpected residue of the O(1)-situation prevailing in the trivial case 
f(t) = const. > 0. 


(i) Under the assumption (3), 

($\alpha$) some but not every, ($\beta$) every, 

solution x(t) of (1) can fail to satisfy 
(5) $x(t) = O(1)$ as $t \to \infty$. 

On the other hand, (3), and even the more general assumption 
(6) $0 < \liminf_{t \to \infty} f(t) \le \infty$, 
implies the existence of a sequence of t-intervals $(t'_n, t''_n)$ which tend to $t = \infty$ 
as $n \to \infty$, are independent of the choice of the pair of integration constants 
determining a solution of (1), and have the following property: If t* varies 
on the (unbounded, open) t-set 
(7) $\bigcup_{n=1}^{\infty} (t'_n, t''_n)$, 
a set depending only on f, then 
(8) $x(t^*) = O(1)$ as $t^* \to \infty$ 
holds for every solution x(t) of (1). 

If the assumption (3) is replaced by its limiting case, 
(9) $f(t) \to \infty$ as $t \to \infty$, 
and if the "growth" of f(t) is smooth enough, it follows from asymptotic 
formulae that 

(10) $x(t) \to 0$ as $t \to \infty$ 

holds for every solution of (1); cf. [21]. But if the last italicized proviso 
is omitted, i.e., if only (9) is assumed, then not even (5) can in general be 
asserted. For an "explicit" example of such an f(t), consisting of a transfer 
of Perron's construction from (3) to (9), cf. [22]. 




OSCILLATORY AND NON-OSCILLATORY EQUATIONS. 629 


This leaves, however, undecided, for (9), both questions answered, for 
(3), by the first part of (i) above. The answers are supplied by the first 
part of the following variant of (i). 

(i bis) Both contingencies ($\alpha$), ($\beta$), described in (i) under the 
assumption (3), can occur if (3) is replaced by (9). 

On the other hand, in the particular case (9) of (6), it is possible to 
refine (8) to 

(8 bis) $x(t^*) = o(1)$ as $t^* \to \infty$ 

if the unbounded, open t*-set (7), which is independent of the pair of 
integration constants determining a solution x(t) of (1), is suitably chosen. 

The second part of (i bis) may be interpreted as what can be saved 
from (10) if the proviso italicized before (10) is omitted. 


Proof of (7), (8)-(8 bis). Let N(t) denote the number of zeros 
possessed by a solution of an arbitrary differential equation (1) on the interval 
[0, t]. It is known (cf., e.g., [6]) that, if $x_1(t)$ and $x_2(t)$ are two linearly 
independent solutions of (1), and if their Wronskian, which is always a 
non-vanishing constant, is normalized to be 1 (or -1), then 

(11) $\pi N(t) = \int_0^t du/r^2(u) + O(1)$ as $t \to \infty$, 
where 
(12) $r = (x_1^2 + x_2^2)^{1/2} > 0$ 

(in fact, $x_1 = x_1(t)$ and $x_2 = x_2(t)$ cannot vanish simultaneously). Needless 
to say, the step-function N(t), as defined before (11), depends on the choice 
of the particular solution, say x(t), whose zeros N(t) enumerates. On the 
other hand, it follows from Sturm's separation theorem that two N-functions 
belonging to two solutions of (1) cannot differ by more than 1 at any t, and 
so the indeterminacy of N(t) is absorbed by the O(1) in (11). 
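For a constant coefficient, relation (11) can be checked by hand, and the sketch below does so numerically. It assumes $f(t) = \omega^2$ with the pair $x_1 = \cos(\omega t)/\sqrt{\omega}$, $x_2 = \sin(\omega t)/\sqrt{\omega}$, whose Wronskian is 1 and for which $r^2 = 1/\omega$; the function names and the sample values of $\omega$ and t are illustrative only.

```python
from math import floor, pi


def zero_count(omega, t):
    """N(t) for x2 = sin(omega u)/sqrt(omega): zeros at u = j*pi/omega, j = 0, 1, ..."""
    return floor(omega * t / pi) + 1


def integral_of_inverse_r_squared(omega, t):
    """Integral of du / r^2(u) over [0, t]; here r^2 = x1^2 + x2^2 = 1/omega."""
    return omega * t


def remainder(omega, t):
    """The O(1) term in (11): pi * N(t) minus the integral."""
    return pi * zero_count(omega, t) - integral_of_inverse_r_squared(omega, t)


# Sample times, chosen away from the zeros of sin(3u) to avoid rounding ties.
sample_times = [0.37 + 1.3 * j for j in range(200)]
remainders = [remainder(3.0, t) for t in sample_times]
```

In this case the remainder always lies between 0 and $\pi$, so it is indeed bounded as t grows.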

All of this is independent of the assumption (6). It will be shown that, 
according as (6) or its refinement (9) is assumed, 

(13) $\liminf r(t) < \infty$ or $\liminf r(t) = 0$, $(t \to \infty)$, 

respectively. This, when combined with the continuity of the function r(t), 
will prove the respective assertions of (i), (i bis) concerning the existence of 
open, unbounded sets (7) on which (8), (8 bis) hold. In order to see this, 




it is sufficient to compare (13) with (12) and to observe that, since $x_1(t)$ 
and $x_2(t)$ are linearly independent, every solution of (1) is of the form 

(14) $x(t) = c_1 x_1(t) + c_2 x_2(t)$, 

where $c_1$ and $c_2$ are independent of t. 

For large t, the linear oscillator $x'' + \omega^2 x = 0$ is a "Sturmian minorant" 
of (1) for some or for every choice of the positive constant $\omega$, according as 
(6) or (9) is assumed for (1). Since the asymptotic density of the zeros of 
the solutions of the linear oscillator is proportional to $\omega$, it follows that, if 
N(t) refers to (1), 

$\liminf N(t)/t > 0$ or $N(t)/t \to \infty$ $(t \to \infty)$ 

according as (6) or (9) is assumed. Finally, it is seen from (11) that the 
last formula line implies the truth of the respective assertions (13). 

The remaining four assertions, viz., ($\alpha$) and ($\beta$) in (i) and (i bis), 
will now be proved in order. 

Proof of ($\alpha$) in (i). Corresponding to every a > 0, it is possible to 
choose an f(t) satisfying (3) and having the property that some solution of 
(1) becomes $O(t^{-a})$ as $t \to \infty$; cf. [20], p. 269. In particular, (3) is 
compatible with the existence of a solution satisfying (10). But if x(t) is a 
solution satisfying (10), then any solution linearly independent of x(t) must 
violate (5); cf. [18], p. 397. 

Proof of ($\beta$) in (i). This contingency, too, is between the lines of [18], 
the situation being as follows: 

(A) If f(t) is subject to (3), and if a solution x = x*(t) of (1) is 
such as to satisfy 

$\liminf a^*_n = 0$, $(n \to \infty)$, 

where $a^*_1, a^*_2, \ldots$ is the sequence of the relative maxima of |x*(t)|, then 
every solution x(t) linearly independent of this x*(t) must violate (5). In 
fact, if the assumption made by the last formula line is strengthened to 
$a^*_n \to 0$ as $n \to \infty$, then, since the latter assumption is equivalent to the case 
x = x* of (10), the violation of (5) by every $x(t) \ne$ const. x*(t) becomes 
the assertion used at the end of the above Proof of ($\alpha$) in (i). On the other 
hand, a glance at the proof ([18], p. 397) of this assertion shows that what 
is actually used there is just the assumption made in the last formula line, 
rather than its strengthened form, which is (10). 




(B) According to [18], p. 396, a function f(t) satisfying (3) can be 
chosen as follows: Some solution x(t) of (1) satisfies both 

$\liminf a_n = 0$ and $\limsup a_n = \infty$ $(n \to \infty)$, 

where $a_1, a_2, \ldots$ denotes the sequence of the relative maxima of |x(t)|. 

If (A) is compared with (B), it is seen that there exists an f(t) 
satisfying (3) and belonging to the contingency ($\beta$) of (i). 

Proof of ($\alpha$) and ($\beta$) in (i bis). The proof depends on an adaptation 
of a known principle of construction (cf. [15]; also [13]). 

Choose a sequence of positive numbers $\lambda_1, \lambda_2, \ldots$ satisfying 

(15) $\lim_{k \to \infty} \lambda_k = \infty$ and $\sum_{k=1}^{\infty} 1/\lambda_k = \infty$, 

put $\beta_k = \frac{1}{2}\pi \sum_{j=1}^{k-1} 1/\lambda_j$, where $\beta_1 = 0$, and define f(t) by placing 

$f(t) = \lambda_k^2$ when $\beta_k \le t < \beta_{k+1}$ $(k = 1, 2, \ldots)$. 

Thus f(t) is defined for $0 \le t < \infty$ by virtue of (15). This f(t), instead 
of being continuous throughout, is a step-function but, as far as the purposes 
at hand go, the jumps of f(t) can be smoothed out by "small" local 
alterations of the graph f = f(t). Clearly, (9) is satisfied. 


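Let $d_1, d_2, \ldots$ and $c_1, c_2, \ldots$ denote the sequences of numbers 

(16) $d_1 = c_1 = 1$, $d_k = \prod_{j=1}^{k-1} \lambda_{2j}/\lambda_{2j+1}$, $c_k = \prod_{j=1}^{k-1} \lambda_{2j+1}/\lambda_{2j+2}$, 

so that 

(17) $c_k = \lambda_2/(\lambda_{2k} d_k)$. 

Put 

$x_1(t) = (-1)^k d_k \cos \lambda_{2k}(t - \beta_{2k})$ or $x_1(t) = (-1)^{k+1} d_{k+1} \sin \lambda_{2k+1}(t - \beta_{2k+1})$ 

according as 

$\beta_{2k} \le t < \beta_{2k+1}$ or $\beta_{2k+1} \le t < \beta_{2k+2}$ 

(so that $x_1(t)$ is defined for $\beta_2 \le t < \infty$), and put 

$x_2(t) = (-1)^k c_k \cos \lambda_{2k+1}(t - \beta_{2k+1})$ or $x_2(t) = (-1)^{k+1} c_{k+1} \sin \lambda_{2k+2}(t - \beta_{2k+2})$ 

according as 

$\beta_{2k+1} \le t < \beta_{2k+2}$ or $\beta_{2k+2} \le t < \beta_{2k+3}$ 

(so that $x_2(t)$ is defined for $\beta_3 \le t < \infty$). Then, since f(t) is the constant 
$\lambda_k^2$ when $\beta_k \le t < \beta_{k+1}$, both $x = x_1(t)$ and $x = x_2(t)$ are solutions of (1) 
on each of the successive intervals $\beta_k < t < \beta_{k+1}$. In addition, the constants 
are adjusted so as to assure the existence of continuous first derivatives $x_1'(t)$, 
$x_2'(t)$ throughout (i.e., at the points $t = \beta_k$ also). This is seen from the 
definition of the $\beta_k$'s. 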


Ad ($\alpha$). Choose 
$\lambda_{2j} = j^2$ and $\lambda_{2j+1} = j$ $(j = 1, 2, \ldots)$, 
so that (15) is satisfied. Furthermore, 
$d_k = (k - 1)!$ and $c_k = 1/[(k - 1)!\,k^2]$, 
by (17). Hence, by the above definitions of $x_1(t)$ and $x_2(t)$, 
$x_1(t) \ne O(1)$ but $x_2(t) = o(1)$, as $t \to \infty$. 

In view of (14), this means that a solution x(t) is bounded, as $t \to \infty$, if 
and only if x(t) = const. $x_2(t)$. Consequently, the present f(t) is of the 
kind required by ($\alpha$) in (i bis). 


Ad ($\beta$). Choose a sequence $\lambda_1, \lambda_2, \ldots$ satisfying (15) and having the 
property that 

(18) $\limsup d_k = \infty$ and $\limsup c_k = \infty$ $(k \to \infty)$ 

hold for the numbers defined by (16). [Such sequences $\lambda_1, \lambda_2, \ldots$ exist; one 
results by placing 


and 
=J and = J? If hone S 7 < 


where $1 = k_1 < k_2 < \cdots$ is a suitable sequence of integers.] Then, if the 
pair of alternative formulae defining $x_1(t)$ and $x_2(t)$ is compared with (16), 
(17) and (18), it is readily verified that a linear superposition (14) of $x_1(t)$ 
and $x_2(t)$ is bounded as $t \to \infty$ only if both constants $c_1$, $c_2$ of (14) vanish. 
This means that (1) has no bounded solution ($\not\equiv 0$), as required by ($\beta$) 
in (i bis). 




The Non-Oscillatory Case. 


If a solution (and therefore, by Sturm's separation theorem, every 
solution) of (1) has an infinity of zeros on the half-line (2), then (1) is 
called oscillatory, and otherwise non-oscillatory. Since the zeros of a solution 
($\not\equiv 0$) cannot cluster at a finite $t = t_0$, it is seen from (11) and (12) that 
(1) is non-oscillatory if and only if some (and/or every) pair of linearly 
independent solutions, $x = x_1(t)$ and $x = x_2(t)$, of (1) satisfies 

(19) $\int^{\infty} dt/\{x_1^2(t) + x_2^2(t)\} < \infty$. 


Further information is contained in the following statements: 

(ii) If (1) is non-oscillatory, then, as $t \to \infty$, 

(a) some solution must fail to be $O(t^{1/2})$; 
(b) every solution can (but need not) be $O(t^{1/2} \log t)$; 
(c) every solution can (but need not) satisfy $t^{1/2} = O(|x(t)|)$; 
(d) some solutions must satisfy 

(20) $\int^{\infty} dt/x^2(t) < \infty$. 

(a) is a consequence of (19); cf. [23]. The negative (parenthetical) 
remarks of (b) and (c) follow from either of the examples f(t) = -1, 
f(t) = 0, since the general solution of (1) then is $x(t) = c_1 e^t + c_2 e^{-t}$, 
$x(t) = c_1 t + c_2$, respectively. The positive assertions of (b) and (c) result 
by choosing $f(t) = (2t)^{-2}$ (for $1 \le t < \infty$) and observing that the general 
solution of (1) then is $x(t) = c_1 t^{1/2} + c_2 t^{1/2} \log t$. 
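The last example can be checked numerically. The sketch below approximates $x'' + f(t)x$ by a central difference for $f(t) = (2t)^{-2}$ and the two solutions $t^{1/2}$ and $t^{1/2} \log t$. The step size, the sample points and the helper name `residual` are assumptions made for the example.

```python
from math import log, sqrt


def residual(x, f, t, h=1e-3):
    """Central-difference approximation of x''(t) + f(t) x(t)."""
    second_derivative = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return second_derivative + f(t) * x(t)


def coefficient(t):
    """The coefficient function f(t) = (2t)^(-2) of the example."""
    return 1.0 / (2.0 * t) ** 2


def first_solution(t):
    return sqrt(t)


def second_solution(t):
    return sqrt(t) * log(t)


sample_points = [2.0, 5.0, 10.0]
residuals = [residual(solution, coefficient, t)
             for solution in (first_solution, second_solution)
             for t in sample_points]
```

Both residuals vanish up to discretization error, and the ratio of the second solution to the first is log t, which is unbounded, in agreement with (a).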

The remaining assertion, (d), was proved in [2]. Needless to say, (d) 
is a refinement of (19). It is understood that the lower limit of integration 
in (20) is meant to be large enough, viz., greater than the last zero of the 
particular solution in question. 

Roughly speaking, the above comments could be interpreted by saying 
that, if (1) is non-oscillatory, there exists a solution which, as $t \to \infty$, is 
"small" (in this regard, cf. (a)-(b) in (v) below), whereas the other 
solutions, those linearly independent of the "small" one, are "large" (in some 
sense). 

Actually, the assertions of (ii) contain about everything that can be 




said of the solutions of an arbitrary non-oscillatory differential equation (1). 
This can be seen by constructions similar to the one which will lead to the 
following contingency: 


(iii) A differential equation (1) can be non-oscillatory and nevertheless 
such that there exists an unbounded, open set (7) which is independent of 
the integration constants and has the following property: As t tends to $\infty$ 
on (7), every solution of (1) tends to 0. 

Much less than this restricted o(1)-situation is prevented by (a) in 
(ii) if t, instead of being restricted to a suitable set (7), tends to $\infty$ without 
a restriction. 


In order to prove (iii), choose two monotone sequences of positive 
numbers, say $k_0, k_1, \ldots$ and $\lambda_0, \lambda_1, \ldots$, satisfying 

(21) $\lim_{n \to \infty} k_n = \infty$ and $\sum_{t < n} k_n^2/\lambda_n = O(e^{-2t})$, $t \to \infty$ 


(such pairs of sequences exist, as seen by choosing k,n =n and A, = nte™" 
if n>0). For the purposes of (22) below, choose A, greater than 7z. 

In terms of the numbers kn, An, define on the half-line (2) a function 
x,(t) by placing 


1/z,(t) =e if 


where n = 0, 1, 2, .... It is seen from (22) that $x_1(t)$ has a continuous 
second derivative on (2) and that 

(23) $x_1(t) > 0$ on (2). 

Hence, if f(t) is defined by 
(24) $f(t) = -x_1''(t)/x_1(t)$, 

then f(t) is continuous on (2). The corresponding (1) is satisfied by 
$x = x_1(t)$; cf. (24). It follows therefore from (23) that (1) is 
non-oscillatory. 

It is readily seen from (21), (22) and (23) that (20) is satisfied by 
$x = x_1(t)$. This fact, when combined with (23), means that 

(25) $x_2(t) = x_1(t) \int_t^{\infty} du/x_1^2(u)$ 




defines on (2) a function, $x_2(t)$. Since (1) is satisfied by $x = x_1(t)$, two 
differentiations of (25) show that (1) is satisfied by $x = x_2(t)$ also. 
Furthermore, (25) alone implies that 

(26) $x_1(t) x_2'(t) - x_2(t) x_1'(t) = -1$, 

which in turn implies that the solutions $x_1(t)$, $x_2(t)$ are linearly independent. 
According to (22), 


du/2,7(u) S2 f + O( S 
as $t \to \infty$. Both terms on the right of this inequality are $O(e^{-2t})$, by the 
second of the relations (21). Hence, (25) and the last formula line imply 
that $x_2(t) = x_1(t)\,O(e^{-2t})$. On the other hand, $x_1(t) = O(e^t)$, by (22). 
Consequently, $x_2(t) = O(e^t)\,O(e^{-2t})$ and so, in particular, 

$\lim x_2(t) = 0$ $(t \to \infty)$. 

Finally, it is seen from (22) and from the first of the relations (21) that 

$\liminf x_1(t) = 0$ $(t \to \infty)$. 

Since $x_1(t)$ and $x_2(t)$ are continuous, it is seen from (23) and (14) that 
the last two formula lines complete the proof of (iii). 

The general properties of an arbitrary non-oscillatory differential 
equation (1), which are collected in (ii), can be completed by the following 
comparison theorem: 

(iv) Let f(t), g(t) be continuous functions satisfying 

(27) $f(t) \ge g(t)$ 

on (2), and suppose that (1) is non-oscillatory. Then, if x = x(t) is any 
solution of the differential equation (1), the differential equation 

(28) $y'' + g(t)y = 0$ 

(is non-oscillatory and) has some solution $y = y(t) = y_x(t)$ satisfying 

(29) $y(t) = O(|x(t)|)$ as $t \to \infty$. 

The parenthetical assertion of this comparison theorem follows, by (27), 
from Sturm's comparison theorem. 




In order to prove (iv), let x(t) be a given solution of (1). Since (1) 
is non-oscillatory, it can be assumed that 

(30) $x(t) > 0$ for $t_0 < t < \infty$, 

if $t_0$ is large enough. Let t be on the half-line $t_0 < t < \infty$. Then, if y(t) 
is any function, there exists a unique function $z(t) = z_y(t)$ satisfying 

(31) $y(t) = z(t) x(t)$. 

This makes applicable the method of the variation of constants, as follows: 
Substitution of (31) into (28) gives 

$z''x + 2z'x' - zfx + gzx = 0$, 

since $x'' = -fx$, by (1). If this differential equation for z = z(t) is 
multiplied by the given function x = x(t) (which, according to (30), is an 
admissible operation), what results can be written in the form 

(32) $(p(t) z')' - q(t) z = 0$, 

where $p = x^2$ and $q = (f - g)x^2$; so that, by (30) and (27), 

(33) $p(t) > 0$ and $q(t) \ge 0$. 

It is now seen from (31) that (iv) is contained in the following lemma: 

(iv*) If p(t) and q(t) are continuous functions satisfying (33) on 
(2), then (32) has a solution z(t) which is O(1) as $t \to \infty$. 

(iv*) is known; it can be deduced (cf. [8]) as a corollary of a theorem 
of Kneser [11], dealing with the particular case p(t) = 1. 


An illustration of (iv bis) can be formulated as follows: 

If $f(t) \le -a/t^2$ holds, for some a = const. > 0, as $t \to \infty$, then (1) 
has a "small" solution of the form $O(t^{-b})$, where b = b(a) > 0: 

(34) $2b = (1 + 4a)^{1/2} - 1$. 

In order to see this, it is sufficient to choose g(t), f(t) in (iv bis) to be 
f(t), $-a/t^2$, respectively (if $t \ge 1$), and to observe that, if b is defined by 
(34), the case $f(t) = -a/t^2$ of (1) is satisfied by $x(t) = t^{-b}$. 
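The exponent in (34) comes from the indicial equation: substituting $x = t^{-b}$ into $x'' - (a/t^2)x = 0$ gives $(b(b + 1) - a)\,t^{-b-2} = 0$, so that $b^2 + b - a = 0$, whose positive root is (34). A minimal numerical check follows; the function names and the sample values of a are illustrative assumptions.

```python
from math import sqrt


def small_exponent(a):
    """The positive number b = b(a) defined by 2b = (1 + 4a)^(1/2) - 1."""
    return (sqrt(1.0 + 4.0 * a) - 1.0) / 2.0


def indicial_residual(a):
    """b(b + 1) - a, which vanishes exactly when t^(-b) solves x'' - (a/t^2) x = 0."""
    b = small_exponent(a)
    return b * (b + 1.0) - a


sample_values = [0.25, 0.75, 2.0, 6.0, 10.0]
residuals = [indicial_residual(a) for a in sample_values]
```

For instance, a = 2 gives b = 1 and a = 6 gives b = 2, so the "small" solutions are 1/t and 1/t^2 respectively.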




The Case of a Non-Oscillatory $\lambda$-Region. 


Some of the assertions and negations of (ii), (iii), (iv) can be refined, 
and the construction proving (iii) adapted to more delicate situations, if the 
non-oscillatory character of (1) is so manifest that 

(35) $x'' + (f(t) + \lambda)x = 0$ 

is non-oscillatory for some, sufficiently small, $\lambda$ = const. > 0. This is a 
strengthening of the assumption that (1) is non-oscillatory. In fact, the 
parenthetical remark in (iv) implies that, if (35) is non-oscillatory for some 
$\lambda$, then (35) is non-oscillatory for every $\lambda$ which is smaller than the given $\lambda$. 

Accordingly, there belongs to every f(t) a $\lambda^*$ having the property that 
(35) is non-oscillatory if 

(36) $-\infty < \lambda < \lambda^*$ 

and oscillatory if $\lambda^* < \lambda < \infty$, where it is understood that both limiting cases 
$\lambda^* = \pm\infty$ are included. If $\lambda^*$ is called the parabolic point of the $\lambda$-axis, 
and if $\lambda^* \ne \pm\infty$, then it depends on f(t) whether (35) is oscillatory or 
non-oscillatory for $\lambda = \lambda^*$. Even in the latter case, some of the following 
considerations do not apply at $\lambda = \lambda^*$. 


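(v) Suppose that 
(37) $-\infty < \lambda^* \le \infty$, 
where $\lambda^*$ denotes the parabolic point of (35). Then 

(a) there belongs to every point of the region (36) a solution 
$x(t) = x_\lambda(t) \not\equiv 0$ of (35) which is of class $(L^2)$ on (2); 

(b) for every $\lambda < \lambda^*$, this $x_\lambda(t)$ is uniquely determined to a constant 
factor; 

(c) although, by (a), 

(38) $0 < \int_0^{\infty} x_\lambda^2(t)\,dt < \infty$, $-\infty < \lambda < \lambda^*$, 

it is not in general true that 

(39) $x_\lambda(t) = O(1)$, $t \to \infty$, 

where $-\infty < \lambda < \lambda^*$; 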




(d) if (39) holds for some $\lambda = \mu < \lambda^*$, then it holds for every $\lambda < \mu$; 

(e) if 
(40) $x_\lambda(t) = o(1)$, $t \to \infty$, 

holds for some $\lambda = \mu < \lambda^*$, then it holds for every $\lambda < \mu$; 

(f) if (39) holds for some $\lambda = \mu < \lambda^*$, then (40) need not hold for 
every $\lambda < \mu$; however, 

(g) if (39) holds but (40) does not hold for a pair of values $\lambda = \lambda_1, \lambda_2$, 
where $\lambda_1 < \lambda_2 < \lambda^*$, then $\lambda^* = \infty$ and 

(41) $\lim_{t \to \infty} x_\lambda(t)/x_\mu(t)$ exists and is not 0 

for $-\infty < \lambda < \mu < \infty$. 

(a) was proved in [2] (by using (d) in (ii) above). 

(b) can be concluded as follows: If (b) is assumed to be false, then, 
for some value of $\lambda$, two linearly independent, hence all, solutions of (35) 
are of class $(L^2)$ on (2); so that (35) is of Grenzpunkt type. But this is 
impossible if (35) is non-oscillatory for some $\lambda$ (cf. [2] and, for a sharper 
result, [6]). This proves (b). 

(d) and (e) are clear from (iv bis). In order to prove (c), (f) and 
(g), the following lemma will first be established: 

Lemma. If (1) is non-oscillatory and has a solution x = x(t) satisfying 

(42) $\int_T^{\infty} x^2(t) (\int_T^t du/x^2(u))\,dt < \infty$ (for some T > 0), 

then (35) possesses, for every $\lambda$, a pair of (linearly independent) solutions 
$x_1 = x_1(t, \lambda)$ and $x_2 = x_2(t, \lambda)$, satisfying 

(43) $x_1(t, \lambda) \sim x(t)$ and $x_2(t, \lambda) \sim x(t) \int_T^t du/x^2(u)$ 

as $t \to \infty$; conversely, if the first relation in (43) holds for some $\lambda \ne 0$ and 
for some $(L^2)$-solution x = x(t) of (1) on (2), then the latter satisfies (42). 

Proof of the Lemma. In (28), let $g(t) = f(t) + \lambda$ (so that (28) and 
(35) are equivalent). The relation (27) will not hold unless $\lambda \le 0$; however, 
this is immaterial in what follows. 




The variation of constants, (31), transforms (28) into (32), where 

$p(t) = x^2(t)$ and $q(t) = -\lambda x^2(t)$. 

If the new independent variable 

(44) $s = s(t) = \int_T^t du/p(u) = \int_T^t du/x^2(u)$ 

is introduced, then (32) becomes 

(45) $\ddot{z} - Q(s) z = 0$ $(\dot{\ } = d/ds)$, 
where 
(45 bis) $Q(s) = p(t(s)) q(t(s)) = -\lambda x^4(t(s))$ 

and t = t(s) is the inverse function of (44). 

If x(t) is of class $(L^2)$, which is certainly the case if (42) holds, then 
$s(t) \to \infty$ as $t \to \infty$. In fact, since the harmonic mean of a positive function 
does not exceed its arithmetical mean, 

$\int_T^t du/x^2(u) \ge (t - T)^2 / \int_T^t x^2(u)\,du$. 

Hence, the truth of the assertion, $s(\infty) = \infty$, is seen from (44). 
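The mean-value inequality just used can be illustrated with a simple quadrature. In the sketch below, the sample function $x(u) = e^{-u}$, the interval [0, 3] and the midpoint rule are assumptions made for the example; the inequality itself is an instance of the Cauchy-Schwarz inequality.

```python
from math import exp


def midpoint_integral(func, lower, upper, steps=2000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (i + 0.5) * h) for i in range(steps))


def sample_solution(u):
    """A positive function of class (L^2) on the half-line (illustrative choice)."""
    return exp(-u)


T, t = 0.0, 3.0
integral_of_inverse_square = midpoint_integral(lambda u: 1.0 / sample_solution(u) ** 2, T, t)
integral_of_square = midpoint_integral(lambda u: sample_solution(u) ** 2, T, t)
lower_bound = (t - T) ** 2 / integral_of_square
```

Because the integral of $x^2$ stays bounded when x is of class $(L^2)$, the lower bound grows like $(t - T)^2$, which forces $s(t) \to \infty$.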
Since the function (45 bis) does not change sign, it follows that (45) 
possesses a solution z = z(s) satisfying 

(46) $z(s) \to 1$ as $s \to \infty$ 

if and only if 

(47) $\int^{\infty} s\,|Q(s)|\,ds < \infty$. 


In fact, whether Q(s) does or does not change its sign, a result of Bôcher 
([1], §4) supplies the sufficiency of (47) for the existence of a solution 
z = z(s) of (45) satisfying (46). It is also known (cf. [16], pp. 40-42 or 
[8], Appendix, where Q = 0, and [5], pp. 533-534, where Q = 0) that this 
sufficient condition is necessary as well if Q does not change sign. 

If $\lambda \ne 0$, then (44) and (45 bis) show that (47) is equivalent to (42). 
It therefore follows from (31) that, if x(t) is of class $L^2(0, \infty)$, then (42) 
is necessary and sufficient in order that (35) has, for some $\lambda \ne 0$ (and/or 



for every $\lambda$), a solution $x = x_1(t, \lambda)$ satisfying the first of the relations in 
(43). Finally, the second of the relations in (43) is a consequence of the 
first, since 

$y(t) \int_T^t du/y^2(u)$ 

is a solution of (35) whenever y is a non-vanishing solution for $t \ge T$. 
This completes the proof of the Lemma. 


Proof of (g) in (v). Without loss of generality, assume that $\lambda_2 = 0$, 
hence $0 < \lambda^* \le \infty$, and that x = x(t) is an $(L^2)$-solution of (1) on (2) 
and is O(1), but is not o(1), as $t \to \infty$. Let $\lambda < 0$ and $g(t) = f(t) + \lambda$. 
Apply the variation of constants, (31), which reduces (28) and/or (35) to 
(32). Then (33) holds (since $\lambda < 0$). Hence, (iv*) assures the existence 
of a bounded solution z = z(t) of (32). Actually, this bounded solution can 
be chosen in such a way that z(t) > 0 and $z'(t) \le 0$ (cf. [8]); in particular, 

(48) $\lim_{t \to \infty} z(t) \ge 0$ exists. 

Since x(t) is O(1) but is not o(1), it is clear from (31) and (48) that an 
$(L^2)$-solution of (28) and/or of (35) is o(1) if and only if the sign of 
equality holds in (48). 

Accordingly, the assumption of (g), namely, that there exists a 
$\lambda_1 < \lambda_2 = 0$ violating the case $\lambda = \lambda_1$ of (40), implies that the limit (48) 
is not 0. Consequently, if z(t) is multiplied by a suitable constant, the limit 
in (48) can be chosen to be 1. Then (31) shows that the first relation in 
(43) holds if $x_1(t, \lambda) = x_\lambda(t)$ and $\lambda = \lambda_1 \ne 0$. In view of the Lemma, this 
completes the proof of (g) in (v). 


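Proof of (c) and (f) in (v). On (2), let $\phi = \phi(t)$ be a continuous 
decreasing function satisfying 

(49) $\int^{\infty} \phi^2(t)\,dt < \infty$, 

so that $\phi(t) > 0$, and $\phi(t) \to 0$ as $t \to \infty$. 
Let $k_1, k_2, \ldots$ and $\lambda_1, \lambda_2, \ldots$ be two sequences of positive numbers 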
satisfying k,n» = 1, An > and 


n+1 
(50)  & (kn?/An) <0 (hence & hn?/An << ©). 
n=1 


n=1 




Define on (2) a function z(t) by placing 


u(t) = ¢(t) if either 


(OT) (t) + ka if 


where n = 1, 2, .... Clearly, x(t) > 0 on (2). 

If $\phi(t)$ has a continuous second derivative on (2), then the same is true 
of x(t), by (51). Hence, the case $x = x_1$ of (24) defines a continuous f(t) 
on (2). For this f(t), the function (51) is a solution of (1), by (24), and 
is of class $(L^2)$ on (2), by (49) and (50). 

It will be shown that if $\phi(t)$ is suitably chosen, then (42) holds for 
the x = x(t) defined by (51). 

First, $x(t) \ge \phi(t)$, by (51). Hence 

$\int_0^t du/x^2(u) \le \int_0^t du/\phi^2(u)$. 


It follows, therefore, from (51) and from Minkowski’s inequality, that (42) 
is satisfied if 


t 
(52) f du/s2(u))at < 
and 
n+T/Xn t 
(53) Sf au/g?(u) at < 


But (53) is a consequence of (50). In order to satisfy (52), it is sufficient 
to choose $(¢) = exp (—?¢*) when 1St< «. For then 


t t 
= f exp (2u’)du = O(1) exp (2#°), 
hence 


f $?(t) ( du/$?(u))dt = O(1) f < 0. 


Accordingly, all of the assumptions of the first part of the Lemma are
fulfilled. Hence, (35) possesses, for every $\lambda$, a pair of solutions $x_1(t, \lambda)$,
$x_2(t, \lambda)$ satisfying (43).

In order to complete the proof of (c) and (f) in (v), it is sufficient to
choose $k_n = n$ and $k_n = 1$, respectively. In fact, it is clear from (51) that



x(t) is not O(1) in the first case, and that x(t) is O(1) but is not o(1) in
the second case. Finally, it is seen from (43) that x(t) can here be replaced
by $x_1(t, \lambda)$ in both cases, where $-\infty < \lambda < \infty$.


The Parabolic Case. 


Before (v), the parabolic case, $\lambda = \lambda^*$, was defined as follows: (35) is
oscillatory or non-oscillatory according as $\lambda > \lambda^*$ or $\lambda < \lambda^*$, where $\lambda^* \neq \pm\infty$.
It will now be shown that, in contrast to the positive assertions of (v), which
assume (36), anything can happen when $\lambda = \lambda^*$.


(vi) Suppose that $\lambda^* \neq \pm\infty$, where $\lambda^*$ denotes the parabolic point of
(35). Then the case $\lambda = \lambda^*$ of (35) can be

(a) oscillatory, (b) non-oscillatory

and it can have on (2)

(α) an $(L^2)$-solution, (β) no $(L^2)$-solution.

In fact, all four possibilities

(54) (aα), (aβ), (bα), (bβ)

can be realized. What cannot occur is that the case $\lambda = \lambda^*$ of (35) be such
as to have two, linearly independent, $(L^2)$-solutions on (2).

First, whether $\lambda^* \neq \pm\infty$ is or is not assumed, (35) cannot have two,
linearly independent, $(L^2)$-solutions on (2) for any $\lambda$ unless the same is true
for every $\lambda$. This is Weyl's theorem (cf. [17], p. 238) on which his alternative
of "Grenzpunkt case or Grenzkreis case" depends (for a sharper result,
establishing an explicit asymptotic connection between the solutions belonging
to two arbitrary $\lambda$-values in the Grenzkreis case, cf. [20], pp. 266-267).
Accordingly, the last assertion of (vi) follows from assertion (b) of (v).

Next, since $x'' + \lambda x = 0$ is oscillatory or non-oscillatory according as
$\lambda > 0$ or $\lambda < 0$, it follows from Sturm's comparison theorem that if

(55) $f(t) \to 0$ as $t \to \infty$,

then (35) is oscillatory or non-oscillatory according as $\lambda > 0$ or $\lambda < 0$. In
other words, (55) is sufficient for $\lambda^* = 0$. In particular, if $f(t) = Ct^{-2}$,
where C is any real constant (and, for instance, $1 \leq t < \infty$, whilst $f(t) = C$
when $t < 1$), then $\lambda^* = 0$; so that the case $\lambda = \lambda^*$ of (35) becomes

(56) $x'' + Ct^{-2}x = 0$.




It is easy to see that, by suitable values of C, each but the first of the four
cases (54) can be realized by (56); cf. [10], pp. 415-416.
In fact, (56) is satisfied by $x(t) = t^c$ if c is a root of a quadratic equation,

(34 bis) $2c = 1 \pm (1 - 4C)^{1/2}$.

If $\tfrac14 < C < \infty$, the two roots c are distinct, complex and have the real
part $\tfrac12$; so that (56) is oscillatory and, since $t^c$ is not of class $(L^2)$ on (2),
the second of the possibilities (54) takes place. If $-\infty < C < \tfrac14$, the two
roots are real and distinct; furthermore, the smaller c is less than $-\tfrac12$ if
and only if $-\infty < C < -\tfrac34$. Since $x(t) = t^c$ is a solution, it follows that
(56) realizes the third or the fourth of the possibilities (54) according as
$-\infty < C < -\tfrac34$ or $-\tfrac34 \leq C < \tfrac14$.
The remaining possibility, that which is the first in (54), is more delicate. 
Its occurrence will now be constructed by an adaptation of the “ Riccati” 
method used in [20], p. 268 (where the A-value, instead of being at a sharply 
defined parabolic point, is well within an “elliptic” range); cf. also [22]. 


Proof of (aα) in (vi). Let (2) be replaced by $1 \leq t < \infty$, put

(57) $x(t) = r(t) \cos \log t$,

where, for the present, the function r(t) is undetermined, and substitute (57)
into (1). This gives

(58) $f(t) = t^{-2} - r''(t)/r(t) + t^{-1}(2r'(t)/r(t) - t^{-1}) \tan \log t$,

(58) being computed from (57) and (1) as $f = -x''/x$.
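
The differentiation leading from (57) to (58) is routine and can be written out as follows. This is a sketch only; the abbreviation $L = \log t$ is introduced here for brevity and is not part of the paper's notation.

```latex
\begin{align*}
x   &= r\cos L, \qquad L = \log t,\\
x'  &= r'\cos L - t^{-1} r\sin L,\\
x'' &= r''\cos L - 2t^{-1} r'\sin L + t^{-2} r\sin L - t^{-2} r\cos L,\\
f = -\frac{x''}{x}
    &= t^{-2} - \frac{r''}{r}
       + t^{-1}\Bigl(\frac{2r'}{r} - t^{-1}\Bigr)\tan L .
\end{align*}
```

The last bracket is the derivative of $2\log r - \log t$, which is the factor that has to absorb the poles of $\tan \log t$.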

In order to make the function (58) continuous on the half-line
$1 \leq t < \infty$, the infinities of the factor $\tan \log t$, which cluster at $t = \infty$,
must be neutralized by an appropriate choice of the first factor of the last
term of (58). Since that first factor is the derivative of

$2 \log r(t) - \log t = \log (r^2(t)/t)$,

such a neutralization results by choosing the logarithmic derivative of $r^2(t)/t$
to be of the form $\chi(t) \cos^2 \log t$, where $\chi(t)$ is an arbitrary continuous function
on the half-line $1 \leq t < \infty$. If $\chi(t) = -2\gamma t^{-1}$, where $\gamma$ is a constant
(which will be chosen below), this gives

(59) $r(t) = t^{1/2} \exp\bigl(-\gamma \int_1^t s^{-1} \cos^2 \log s \, ds\bigr)$.




If $r'(t)$ and $r''(t)$ are computed from (59), it is found that (58)
reduces to

(60) $4t^2 f(t) = 5 - 4\gamma^2 \cos^4 (\log t) - 8\gamma \sin (2 \log t)$.

Clearly, (60) defines, for $1 \leq t < \infty$, a continuous f(t), and (57), with
(59), represents a solution of (1) for this f(t).

Since (55) is satisfied by (60), and since (55) is sufficient for $\lambda^* = 0$,
the parabolic case of (35) becomes (1). Hence, the possibility (aα), claimed
in (vi), will be realized if it is shown that (1) now possesses a solution
which has an infinity of zeros and is of class $(L^2)$. But

$\int_1^t s^{-1} \cos^2 \log s \, ds = \int_0^{\log t} \cos^2 s^* \, ds^* = \tfrac12 t^* + \tfrac14 \sin 2t^*$,

where $s^* = \log s$, $t^* = \log t$. Hence, from (57) and (59),

$x(t) = (\cos \log t)\, t^{\frac12 - \frac12\gamma} \exp(-\tfrac14 \gamma \sin 2 \log t)$.

Consequently, x(t) is of class $(L^2)$ on the half-line $1 \leq t < \infty$ if the constant
$\gamma$, which thus far was arbitrary, is chosen so as to satisfy $\tfrac12 - \tfrac12\gamma < -\tfrac12$.
Since the last formula line also shows that x(t) acquires an infinity of zeros
as $t \to \infty$, the proof is complete.
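
A numerical spot check of this construction is easy to set up. The sketch below is not part of the proof: it evaluates the closed form of x(t) obtained from (57) and (59), the coefficient f(t) from (60), and the size of x'' + f x by a central difference. The function names, the step size and the normalisation of the residual are choices made here for illustration.

```python
import math


def x_solution(t, gamma):
    """x(t) = cos(log t) * t**(1/2 - gamma/2) * exp(-(gamma/4) sin(2 log t))."""
    L = math.log(t)
    return math.cos(L) * t ** (0.5 - 0.5 * gamma) * math.exp(-0.25 * gamma * math.sin(2 * L))


def f_coefficient(t, gamma):
    """f(t) from (60): 4 t^2 f = 5 - 4 gamma^2 cos^4(log t) - 8 gamma sin(2 log t)."""
    L = math.log(t)
    return (5 - 4 * gamma ** 2 * math.cos(L) ** 4 - 8 * gamma * math.sin(2 * L)) / (4 * t * t)


def relative_residual(t, gamma, h=1e-4):
    """|x'' + f x| at t, with x'' from a central difference, scaled by an
    upper bound for |x| / t^2 so that zeros of x cause no division trouble."""
    step = h * t
    x0 = x_solution(t, gamma)
    xpp = (x_solution(t + step, gamma) - 2 * x0 + x_solution(t - step, gamma)) / step ** 2
    amplitude = t ** (0.5 - 0.5 * gamma) * math.exp(0.25 * abs(gamma))
    return abs(xpp + f_coefficient(t, gamma) * x0) / (amplitude / t ** 2)


def is_square_integrable(gamma):
    """The envelope t**(1/2 - gamma/2) is in L^2(1, oo) iff 1/2 - gamma/2 < -1/2."""
    return 0.5 - 0.5 * gamma < -0.5


if __name__ == "__main__":
    for t in (1.5, 4.0, 20.0, 300.0):
        print(t, relative_residual(t, gamma=3.0))
```

The zeros of x(t) sit at t = exp(pi/2 + k pi), k = 0, 1, 2, ..., so they are unbounded for every value of the constant, while square integrability requires the constant to exceed 2.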

Since the oscillating function (60) is O(1) as $t \to \infty$, the f(t) defined
by (58) is of the same order, $t^{-2}$, as the coefficient function of (56). On
the other hand, whilst the trivial family (56) sufficed for the realization of
all but the first of the four possibilities (54), the delicate construction leading
to (58) was needed for the realization of the first of these possibilities. This
situation is explained by the following criterion:


(vi bis) Case (aα) of (vi) cannot occur if f(t) is monotone for large t.


Clearly, it is sufficient to prove this under the assumption that f(t) is 
monotone throughout, i.e., on (2). 

First, it is clear from Sturm's comparison theorem that, if f(t) tends
to a limit, $f(\infty)$, then (35) is non-oscillatory for every $\lambda$, oscillatory for
every $\lambda$, or non-oscillatory for every $\lambda < f(\infty)$ and oscillatory for every
$\lambda > f(\infty)$, according as $f(\infty) = -\infty$, $f(\infty) = +\infty$ or $-\infty < f(\infty) < \infty$. Hence,
$\lambda^* = f(\infty)$ holds in all three cases. Since (vi) assumes that $\lambda^* \neq \pm\infty$,
and since the constant $\lambda^*$ can be subtracted from f(t) in (35) if the origin
of the $\lambda$-axis is translated by $\lambda^*$, it follows that (vi bis) need be proved only
under the assumption $f(\infty) = 0$, and that the case $\lambda = \lambda^*$ of (35) then




becomes (1). In addition, f(t) cannot change sign, since f(t) is monotone.
If $f(t) \leq 0$, then, according to Sturm, (1) is in case (b) of (vi), and so
(vi bis) does not claim anything. Let therefore $f(t) \geq 0$. Then $df(t) \leq 0$,
since f(t) is monotone and $f(\infty) = 0$.

Suppose that (1) is in case (α) of (vi). Then, since f(t) = O(1) as
$t \to \infty$, the $(L^2)$-solution in question is subject to

$x(t) \to 0$ as $t \to \infty$;

cf. [19], p. 8. Suppose, in addition, that (1) is in case (a) of (vi).
Then, if $t_1 < t_2 < \cdots$, where $t_n \to \infty$ as $n \to \infty$, denote the zeros of x(t),
and if the maximum of $|x(t)|$ on the t-interval $(t_{n-1}, t_n)$ is $|x(t_n^*)|$,

$|x(t_n^*)| \leq |x(t_{n+1}^*)|$, where $t_n^* \to \infty$ as $n \to \infty$.

In fact, these inequalities follow from the assumption $df(t) \leq 0$; cf., e.g.,
[22].

Since the last two formula lines are contradictory, it follows that (a) is
incompatible with (α). This proves (vi bis).


Remark. It is seen from the relevant part of the proof of assertion (c)
of (v) that, if f(t) is wobbly enough as $t \to \infty$, the determination of the
parabolic $\lambda$-value, $\lambda^*$, of (35) can become quite difficult. That the task
must depend on "transcendental" means, can be illustrated by the analytical
machinery (such as infinite determinants) which, in view of the following
criterion, must be mobilized if the determination of $\lambda^*$ is required in such
a comparatively harmless case as

(61) $f(t) = f(t + 1)$:

If f(t) is periodic on (2), then (35) has a parabolic $\lambda$-value,
$\lambda = \lambda^* \neq \pm\infty$, which can be characterized as follows: $\lambda^*$ is the least $\lambda$-value
corresponding to which (35) possesses a solution having the same period
as f(t).


The proof will be omitted, since it is between the lines of the known 
description ([9], p. 280; [12]) of the distribution of A-intervals of stability 
and instability in the case (61). 


Spectral Formulations. 


In what follows, the terminology will be the same as in the latter part 
of the Proof of (c) in (v) above. 




(A) Suppose that

(62) $-\infty \leq \limsup_{t \to \infty} f(t) < \infty$.

Then (35) is of the Grenzpunkt type ([17]; for a sharper result, cf. [6]). It
was shown in [7] that, under the assumption (62),

(I) $\lambda = 0$ must be in the spectrum determined by (35) and a boundary
condition, $ax(0) + bx'(0) = 0$, if (1) has at least one solution, $x = x(t)$,
which satisfies this boundary condition and is O(1) as $t \to \infty$; and that

(II) $\lambda = 0$ must be in the (invariant) essential spectrum of (35) if
(35) has two, linearly independent, solutions which are O(1) as $t \to \infty$.

It turns out that the respective sufficient conditions of (I) and (II),
represented by the O(1)-assumptions, are not necessary as well (in either
case); not even if (62) is strengthened to (3). This follows from the actual
occurrence of both possibilities realized by (i) above. In fact, if f(t) satisfies
(3) and if the limit (3) is denoted by $\mu$, then, as shown in [3], the essential
spectrum of (35) consists of the half-line $-\mu \leq \lambda < \infty$.


Remark. In the limiting case (9), where $\mu = \infty$, it is no longer true
that (35) must be of Grenzpunkt type, and so the limiting case of the theorem
quoted at the end of (A) ought to be formulated as follows: If f(t) satisfies
(9) and is such that (35) is not of Grenzkreis type, then the essential spectrum
of (35) consists of the entire line, $-\infty < \lambda < \infty$. We did not prove
that this limiting case of the theorem quoted is true. On the other hand, it
is surely true in the other limiting case, in that belonging to $\mu = -\infty$. In
fact, what is then claimed is that the essential spectrum consists of the points
$\lambda$ satisfying $-\mu \leq \lambda$ if $\mu = -\infty$. But this is equivalent to the statement
that the essential spectrum is vacuous (i.e., that the Green kernels belonging
to (35) are completely continuous), if

(9 bis) $f(t) \to -\infty$ as $t \to \infty$.

And this assertion is known to be true; it is contained, as a limiting case,
in a theorem of Weyl, which will be quoted at the beginning of (B) below
(and which his proof establishes for the limiting case, $\mu = -\infty$, also).


(B) Suppose that (35) has a parabolic point, $\lambda^* \neq \pm\infty$, and choose,
without loss of generality, $\lambda^*$ to be 0. Thus, with reference to any boundary
condition, $ax(0) + bx'(0) = 0$, the number of those points of the point
spectrum which are to the left of $\lambda^*$ is finite or infinite according as (35) is




non-oscillatory or oscillatory at $\lambda = \lambda^*$. In either case, those points of the
spectrum which are to the left of $\lambda^*$ belong to the point spectrum only.
According to Weyl [17], pp. 252-256, all of this can be concluded by an
adaptation of a classical argument of Sturm.

As shown in [2], a proof of this theorem of Weyl can be adjusted so
as to avoid assumption (62). What is actually needed in the Sturmian
argument is the more general proviso that the differential equation be of
Grenzpunkt type. But Weyl deals with the arbitrary self-adjoint case,

(1 bis) $(p(t)x')' + (f(t) + \lambda)x = 0$,

where p(t) is any positive, continuous function on (2), rather than with the
particular case (1), where p(t) = 1. In the latter case, the proviso is not
needed, since $\lambda^* > -\infty$ alone implies that (1) is of Grenzpunkt type. Cf.
[6], where a sharper theorem is proved.

Due to the above criterion concerning that portion of the point spectrum
which is to the left of $\lambda^*$, the f(t) defined by (60) leads to a peculiar
situation. In fact, since (55) is satisfied, $\lambda^* = 0$. But (57) is a solution
of (1), i.e., of the case $\lambda = \lambda^*$ of (35). Since (57) has arbitrarily large
zeros, it follows that those points of the spectrum which are to the left of
0 form an infinite sequence which tends to 0.

This holds for the point spectrum belonging to any boundary condition.
Calculate x(0), x'(0) from (55) and (57), and consider that particular
boundary condition which is satisfied by the resulting initial values x(0),
x'(0). Then, since (59) makes (57) a function of class $(L^2)$, the point
spectrum will contain $\lambda = 0$. Accordingly, $\lambda = 0$ is in the point spectrum.
At the same time, $\lambda = 0$ is a cluster point of the point spectrum.


In addition, $\lambda = 0$ is in the continuous spectrum also. In fact,

(63) $\int^\infty |f(t)| \, dt < \infty$,

since $f(t) = O(t^{-2})$, by (60). But (63) is known to imply that every point
of the closed half-line $0 \leq \lambda < \infty$ is in the continuous spectrum of (35)
and, incidentally, that no point of the open half-line $0 < \lambda < \infty$ is in the
point spectrum. A simple proof results by observing that, if (63) is satisfied,
then, according to Bôcher [1], two linearly independent solutions of (35)
are of the form $x_1 = \cos \omega t + o(1)$, $x_2 = \sin \omega t + o(1)$ or $x_1 = e^{\omega t} + o(e^{\omega t})$,
$x_2 = e^{-\omega t} + o(e^{-\omega t})$ according as $\lambda > 0$ or $\lambda < 0$, where $\omega = |\lambda|^{1/2}$ and $t \to \infty$
in both cases. On the other hand, in order that two linearly independent




solutions of (35) be of the form $x_1 = 1 + o(1)$, $x_2 = t + o(t)$ in the parabolic
case $\lambda = 0$, Bôcher requires

(63 bis) $\int^\infty |t f(t)| \, dt < \infty$,

rather than just (63). This explains why $\lambda = 0$ can be in a point spectrum
belonging to the above example, f(t) being just about $t^{-2}$; cf. (60).


(C) Let f(t) be any function for which (35) becomes of Grenzpunkt
type, and let $\lambda$ be any fixed value not contained in the essential spectrum
(provided that there exists such a $\lambda$). Then (35) has a solution, $x = x_\lambda(t)$,
which is of class $(L^2)$ on (2) (and, to a constant factor, is uniquely determined);
cf. [4].

It was shown in [24] that such an $x_\lambda(t)$ must be "very small,"

$x_\lambda(t) = O(t^{-N})$ for every fixed N, (N > 0),

as soon as (35) is of Grenzpunkt type by virtue of (62). It turns out that,
if (62) is replaced by another Grenzpunkt criterion, then not only can $x_\lambda(t)$
fail to be "very small," and not only can (39) become false, but even the
following situation is possible: (35) is of Grenzpunkt type and has no essential
spectrum, but the $(L^2)$-solution $x_\lambda(t)$, which exists for $-\infty < \lambda < \infty$, fails
to be O(1) at any $\lambda$.

This is accomplished by the proof of (c) in (v). In fact, the f(t)
constructed there was such as to make (35) non-oscillatory for every $\lambda$.
Since this means that $\lambda^* = \infty$, it implies that (35) is of Grenzpunkt type
and has no essential spectrum; cf. [2]. Finally, it was seen in the proof of
(c) in (v) that (39) is violated at every $\lambda$.


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] M. Bôcher, "On regular singular points of linear differential equations of the second order whose coefficients are not necessarily analytic," Transactions of the American Mathematical Society, vol. 1 (1900), pp. 40-53.

[2] P. Hartman, "On differential equations with non-oscillatory eigenfunctions," Duke Mathematical Journal, vol. 15 (1948), pp. 697-709.

[3] P. Hartman, "On the spectra of slightly disturbed linear oscillators," American Journal of Mathematics, vol. 71 (1949), pp. 71-79.

[4] P. Hartman and A. Wintner, "An oscillation theorem for continuous spectra," Proceedings of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[5] P. Hartman and A. Wintner, "On non-conservative linear oscillators of low frequency," American Journal of Mathematics, vol. 70 (1948), pp. 529-539.

[6] P. Hartman and A. Wintner, "A criterion for the non-degeneracy of the wave equation," ibid., vol. 71 (1949), pp. 206-213.

[7] P. Hartman and A. Wintner, "On the location of spectra of wave equations," ibid., vol. 71 (1949), pp. 214-217.

[8] P. Hartman and A. Wintner, "On the Fourier-Laplace transcendents," ibid., vol. 71 (1949), pp. 367-372.

[9] O. Haupt, "Ueber lineare homogene Differentialgleichungen 2. Ordnung mit periodischen Koeffizienten," Mathematische Annalen, vol. 79 (1919), pp. 278-285.

[10] A. Kneser, "Untersuchungen über die reellen Nullstellen der Integrale linearer Differentialgleichungenssysteme," ibid., vol. 32 (1893), pp. 409-435.

[11] A. Kneser, "Untersuchung und asymptotische Darstellung der Integrale gewisser linearer Differentialgleichungen bei grossen reellen Werthen des Arguments, I," Journal für die reine und angewandte Mathematik, vol. 116 (1896), pp. 178-212.

[12] H. A. Kramers, "Das Eigenwertproblem im eindimensionalen periodischen Kraftfelde," Physica, vol. 2 (1935), pp. 483-490.

[13] H. Milloux, "Sur l'équation différentielle x'' + xA(t) = 0," Prace Matematyczno-Fizyczne, vol. 41 (1934), pp. 39-54.

[14] O. Perron, "Ueber ein vermeintliches Stabilitätskriterium," Göttinger Nachrichten, 1930, pp. 28-29.

[15] E. Schrödinger, "Verwaschene Eigenwertspektra," Sitzungsberichte der Preussischen Akademie der Wissenschaften, 1929, pp. 668-682.

[16] H. Weyl, "Ueber gewöhnliche Differentialgleichungen mit singulären Stellen und ihre Eigenfunktionen," Göttinger Nachrichten, 1909, pp. 37-63.

[17] H. Weyl, "Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen," Mathematische Annalen, vol. 68 (1909), pp. 220-269.

[18] A. Wintner, "The adiabatic linear oscillator," American Journal of Mathematics, vol. 68 (1946), pp. 385-397.

[19] A. Wintner, "(L²)-connections between the potential and kinetic energies of linear systems," ibid., vol. 69 (1947), pp. 5-13.

[20] A. Wintner, "Asymptotic integrations of the adiabatic oscillator," ibid., vol. 69 (1947), pp. 251-272.

[21] A. Wintner, "On the normalization of the characteristic differentials in continuous spectra," Physical Review, vol. 72 (1947), pp. 516-517.

[22] A. Wintner, "Stability and high frequency," Journal of Applied Physics, vol. 18 (1947), pp. 941-942.

[23] A. Wintner, "A criterion of oscillatory stability," Quarterly of Applied Mathematics, vol. 7 (1949), pp. 115-117.

[24] A. Wintner, "On the smallness of isolated eigenfunctions," American Journal of Mathematics, vol. 71 (1949), pp. 603-611.


A SEPARATION THEOREM FOR CONTINUOUS SPECTRA.* 


By PHILIP HARTMAN and AUREL WINTNER.


On the half-line $0 \leq t < \infty$, let $p = p(t) > 0$ and $q = q(t)$ be real-valued,
continuous functions, and consider real-valued, non-trivial ($\not\equiv 0$)
solutions of the differential equation

(1) $(px')' + (q + \lambda)x = 0$,

where $\lambda$ is a real parameter. Suppose that (1) is of Grenzpunkt type, i.e.,
that (1) does not possess two linearly independent solutions both of which
are of class $(L^2)$, that is to say that

(2) $\int_0^\infty x^2(t)\,dt < \infty$.

According to Weyl [4], p. 238, this requirement will be satisfied for every $\lambda$
if it is satisfied for a single $\lambda$. Since what is required is that not every
solution of (1) be of class $(L^2)$, the assumption requires that (1) and a
linear, homogeneous boundary condition at t = 0, such as x(0) = 0 or
x'(0) = 0 (more generally,

(3) $x(0) \sin \alpha - p(0)x'(0) \cos \alpha = 0$,

where $\alpha$ is a real number) should determine an eigenvalue problem.

In particular, there belongs to every choice of $\alpha$ in (3) a spectrum,
$S(\alpha)$, and in it a point spectrum, $P(\alpha)$ (which can be vacuous, whilst $S(\alpha)$
cannot). According to Weyl [4], p. 251, the set of the cluster points of
$S(\alpha)$ is independent of $\alpha$ and can therefore be denoted simply by $S'$. Thus
$S(\alpha) = S' + P(\alpha)$, where it is understood that $S'$ and $P(\alpha)$ can have points
in common. Clearly, $P(\alpha) = P(\alpha + \pi)$; cf. (3). Conversely, a point $\lambda$
belongs to both $P(\alpha)$ and $P(\beta)$ only if $\alpha \equiv \beta \pmod{\pi}$. For, on the one
hand, a given $\lambda$ is in $P(\alpha)$ if and only if (1) has, for this $\lambda$, an $(L^2)$-solution
satisfying $(3_\alpha)$ and, on the other hand, (1) should not have, for
this (or for any) $\lambda$, two linearly independent $(L^2)$-solutions.

Let $S^*$ denote the set of those points of the line $-\infty < \lambda < \infty$ which
are not in $S'$. Then, since $S'$ is a closed set (possibly the entire line), the


* Received October 25, 1948. 




set $S^*$ is open (possibly vacuous). It was shown in [2] that, if $\lambda$ is in $S^*$,
then (1) has a solution satisfying (2). This means that, if $S^0$ denotes the
interior of (i.e., the greatest open set contained in) the set of those $\lambda$-values
corresponding to which (1) has a solution satisfying (2), then $S^*$ is a subset
of $S^0$. We did not succeed in proving that $S^0$ is always identical with $S^*$.
The nature of the difficulties involved will be clear from the type of the results
which will be proved in the Appendix below.

Since $S^*$ is in $S^0$, there belongs to every $\lambda$ contained in $S^*$ an $\alpha = \alpha_\lambda$
for which $\lambda$ becomes a point of $P(\alpha)$, and this $\alpha = \alpha_\lambda$ is unique mod $\pi$ (for
every $\lambda$ in $S^*$). Suppose that $S^*$ is not vacuous (i.e., that not every real $\lambda$
is in $S(\alpha)$ for some and/or every $\alpha$), and let $S^* = \sum J_i$ be the canonical
decomposition of the open set $S^*$ into a (finite or infinite) sequence of
mutually disjoint, open intervals (one of which can be a half-line). It will
be shown below that, if the indeterminacy (mod $\pi$) of $\alpha_\lambda$ is suitably normalized
at every point $\lambda$ of $S^*$, then

(i) the $\alpha = \alpha_\lambda$ for which $\lambda$ becomes a point of $P(\alpha)$ is a continuous
function of the position, $\lambda$, on $S^* = \sum J_i$; what is more,

(ii) the continuous $\alpha_\lambda$ is a regular function on every $J_i$; in addition,

(iii) the continuous $\alpha_\lambda$ is a (strictly) increasing function on every $J_i$.


At the end of the paper, a fact corresponding to (ii) will result concerning
the eigenfunctions themselves. It is understood that, in (ii), the regularity
of $\alpha_\lambda$ on the real $\lambda$-interval $J_i$ means that $\alpha_\lambda$ possesses a (direct and regular)
analytic continuation from $J_i$ into the upper and lower half-planes of the
complex variable $\lambda$.
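
Assertions (i)-(iii) can be seen at work in the simplest example, which is not taken from the paper and is added here only as an illustration: p(t) = 1 and q(t) = 0, so that (1) is x'' + λx = 0. Then the set of cluster points is the half-line λ ≥ 0, its complement is the single interval λ < 0, the square-integrable solution is exp(-κt) with κ = (-λ)^(1/2), and (3) gives tan α = -κ. The names below are hypothetical, and the normalisation -π/2 < α < 0 is one admissible choice.

```python
import math


def alpha_of_lambda(lam):
    """Boundary angle alpha (normalised to -pi/2 < alpha < 0) for which lam < 0
    is an eigenvalue of x'' + lam x = 0 under x(0) sin(alpha) - x'(0) cos(alpha) = 0."""
    if lam >= 0:
        raise ValueError("lam must lie in the complement of the essential spectrum, lam < 0")
    kappa = math.sqrt(-lam)        # x(t) = exp(-kappa t) is the L^2 solution
    return -math.atan(kappa)       # from sin(alpha) + kappa cos(alpha) = 0


def boundary_residual(lam, alpha):
    """Left-hand side of (3) for x(t) = exp(-kappa t): x(0) = 1, x'(0) = -kappa."""
    kappa = math.sqrt(-lam)
    return 1.0 * math.sin(alpha) - (-kappa) * math.cos(alpha)


if __name__ == "__main__":
    for lam in (-100.0, -4.0, -1.0, -0.01):
        a = alpha_of_lambda(lam)
        print(lam, a, boundary_residual(lam, a))
```

In this example the angle rises strictly from -π/2 towards 0 as λ runs from -∞ to 0, and it is an analytic function of λ on that interval, as (i), (ii) and (iii) require.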

Clearly, (i) and (iii) imply the following 


SEPARATION THEOREM. If $\lambda'$ and $\lambda''$, where $\lambda' < \lambda''$, are (isolated) eigenvalues
of any boundary condition (3) and if the open interval $(\lambda', \lambda'')$ does
not contain any point of the spectrum of that boundary condition, then $(\lambda', \lambda'')$
contains exactly one point of the spectrum of every boundary condition (3)
distinct from the given one.


The need for such a separation theorem, a theorem depending on the
monotony of $\alpha_\lambda$, was formulated in [2] as a desideratum. The use of a
classical argument of Sturm (cf. [4], pp. 252-255) is prevented by the possibility
that two $(\lambda', \lambda'')$-intervals can be separated by points of $S'$, e.g., by a
continuous spectrum. That such a case of "band spectra" can actually occur,
is shown by the example of any periodic $q(t) \not\equiv$ const. in (1), with p(t) = 1.




Proof of (ii) and (iii). 


(i) is contained in (ii). In order to prove (ii) and (iii), it will be
sufficient to show that, if $\lambda_0$ is a given point of $S^*$, the function $\alpha_\lambda$ is regular
and strictly increasing on some, sufficiently small, open interval containing $\lambda_0$.

Since (ii) requires the consideration of complex $\lambda$-values also, let (2)
be reworded as

(4) $\int_0^\infty |x(t)|^2\,dt < \infty$.

According to Weyl [4], p. 238, there belongs to every non-real $\lambda$ a solution
of (1) satisfying (4), and this solution is unique to a constant factor, (1)
being of Grenzpunkt type.

Let $\lambda_0$ be a given point of $S^*$. Denote by $\alpha^0$ an $\alpha$-value corresponding
to which $P(\alpha)$ contains $\lambda_0$, and normalize $\alpha^0$ by $-\tfrac12\pi \leq \alpha^0 < \tfrac12\pi$. Let $\theta$ be
any real number satisfying $\theta \not\equiv \alpha^0 \pmod{\pi}$. Then, since $\lambda_0$ is in the (open)
complement of $S'$, a sufficiently small open interval containing $\lambda_0$ will not
contain any point of $S(\theta)$. For, on the one hand, $\lambda_0$ is in $P(\alpha^0)$, where $\alpha^0 \not\equiv \theta$
(mod $\pi$) and, on the other hand, $S(\theta)$, hence $P(\theta)$, does not cluster at $\lambda_0$.
Hence, there exists about the point $\lambda_0$ of the complex $\lambda$-plane a sufficiently
small circle having the property that, corresponding to every $\lambda$ within the
circle, (1) has a solution satisfying (4), and this solution is unique to a
constant factor and does not satisfy $(3_\theta)$.

It follows that, if $\lambda$ is any point of a sufficiently small circle about $\lambda_0$,
there exists a unique (complex) number $l = l(\lambda)$ having the property that

(5) $x = \xi(t, \lambda)$

satisfies the differential equation (1), the $(L^2)$-condition

(6) $0 < \int_0^\infty |\xi(t, \lambda)|^2\,dt < \infty$

and the initial conditions

(7) $\xi(0, \lambda) = -\sin \theta + l(\lambda) \cos \theta$, $p(0)\xi'(0, \lambda) = \cos \theta + l(\lambda) \sin \theta$,

where $\theta$ is the real constant fixed above. In fact, this definition of $l(\lambda)$
is equivalent to the following representation of the $(L^2)$-solution (5):

(8) $\xi(t, \lambda) = x_1(t, \lambda) + l(\lambda)x_2(t, \lambda)$,

where $x_1$, $x_2$ is that pair of linearly independent solutions of (1) which is
determined by the initial conditions

$(9_1)$ $x_1(0, \lambda) = -\sin \theta$, $p(0)x_1'(0, \lambda) = \cos \theta$;
$(9_2)$ $x_2(0, \lambda) = \cos \theta$, $p(0)x_2'(0, \lambda) = \sin \theta$.

According to Weyl [4], pp. 226-227, the function $l(\lambda)$ or the resulting
analytic continuation is regular in the open upper (and lower) $\lambda$-half-plane.
Clearly, $l(\lambda)$ is real-valued along the real diameter of the circle.

In contrast to the above definition of (5), let

(10) $x = x(t, \lambda)$

denote that solution of (1) which belongs to the initial conditions

(11) $x(0, \lambda) = -\sin \theta + l(\lambda_0) \cos \theta$, $p(0)x'(0, \lambda) = \cos \theta + l(\lambda_0) \sin \theta$.

Thus $x(t, \lambda_0) = \xi(t, \lambda_0)$, by (7), but, in contrast to the function (5), the
function (10) is not in general of class $L^2(0, \infty)$. It becomes of class
$L^2(0, \infty)$ at $\lambda = \lambda_0, \lambda_1, \lambda_2, \cdots$, where the latter sequence (which can be
finite but, since it contains $\lambda_0$, cannot be vacuous) consists of the points of
the point spectrum $P(\alpha^0)$. Put

(12) $\gamma_n = \int_0^\infty x^2(t, \lambda_n)\,dt$, where $0 < \gamma_n < \infty$ $(n \geq 0)$.


Besides the eigenfunctions $x(t, \lambda_n)$, use will have to be made of Hellinger's
eigendifferentials $x(t, \lambda)d\phi(\lambda)$ ([3], pp. 252-258) which, according to Weyl
([4], pp. 239-251), can be introduced, with reference to (1), as follows:
There exists on the line $-\infty < \lambda < \infty$ a continuous, non-decreasing function
$\phi = \phi(\lambda)$ which is constant on each of the components of the open set
$S^*$ and has the following properties: For every fixed $\mu$, where $-\infty < \mu < \infty$,
the function of t

(13) $\Phi(t, \mu) = \int_0^\mu x(t, \alpha)\,d\phi(\alpha)$

is of class $L^2(0, \infty)$, satisfies the homogeneous integro-differential equation

(14) $(p(t)\Phi'(t, \mu))' + (q(t) + \mu)\Phi(t, \mu) = \int_0^\mu \Phi(t, \alpha)\,d\alpha$




[which corresponds to (1) if, besides $\lambda = \mu$, an infinitesimal vicinity of this
$\lambda = \mu$ is also taken into account] and, when combined with the eigenfunctions,
supplies a complete ortho-normal set. According to Hellinger (loc. cit.),
the last condition means that, if f(t) is any continuous function (or Baire
function) of class $L^2(0, \infty)$, then Parseval's relation,

(15) $\int_0^\infty |f(t)|^2\,dt = \sum_n |c(\lambda_n)|^2 + \int_{-\infty}^{\infty} |dC(\mu)|^2 / d\phi(\mu)$,

holds for the Fourier constants belonging to the contributions of the point
spectrum and the continuous spectrum together, i.e., for

(16) $c(\lambda_n) = \int_0^\infty f(t)x(t, \lambda_n)\,dt / \gamma_n^{1/2}$ [cf. (2), (12)]

and

(17) $C(\mu) = \int_0^\infty f(t)\Phi(t, \mu)\,dt$.

Parseval's relation, (15), will be needed for the f(t) represented by (5)
(for a fixed $\lambda \neq \lambda_n$). In order to calculate the corresponding Fourier constants,
(16) and (17), use will be made of the following fact: If D denotes
the differential operator of the case $\lambda = 0$ of (1), i.e., if

(18) $D(f) = (pf')' + qf$

and if f, g are of class $(L^2)$ on $0 \leq t < \infty$, have continuous derivatives f', g'
which make the products pf', pg' absolutely continuous locally and the functions
D(f), D(g) of class $(L^2)$ on $0 \leq t < \infty$, then

(19) $w_{fg}(t) \to 0$ as $t \to \infty$,

where $w_{fg}(t)$ denotes the Wronskian,

(20) $w_{fg} = (f'g - g'f)p$.


First, if $x = f$ and $x = g$ are real-valued and, besides satisfying the
conditions just mentioned, satisfy a common boundary condition, $(3_\alpha)$, then
(19) follows from the relation

(21) $\int_0^\infty f D(g)\,dt = \int_0^\infty g D(f)\,dt$




and from the usual Green identity. The relation (21) was proved by Weyl
[4], p. 242; it expresses the self-adjoint character of the problem. Next, if
the last italicized proviso, without which (21) is not valid, is omitted, then
(19) remains true, since, in view of (20), the relation (19) has nothing to
do with the behavior of f(t), g(t) near t = 0. Finally, it is clear from (20)
that, (19) being true for real-valued f, g, it must be true for complex-valued
f, g also.
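
The "usual Green identity" invoked here can be spelled out in one line. The following is a sketch under the assumption that the Wronskian (20) is taken with the sign convention $w_{fg} = (f'g - g'f)p$; with the opposite convention both sides simply change sign.

```latex
\begin{align*}
g\,D(f) - f\,D(g)
  &= g\,(pf')' - f\,(pg')'
   = \bigl(p\,(f'g - g'f)\bigr)' = w_{fg}'(t),\\
\int_0^T \bigl\{ g\,D(f) - f\,D(g) \bigr\}\,dt
  &= w_{fg}(T) - w_{fg}(0).
\end{align*}
```

The terms $qfg$ cancel in the first line. If f and g satisfy a common boundary condition (3), then $w_{fg}(0) = 0$, and letting $T \to \infty$ shows that (19) is equivalent to the symmetry relation (21).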
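Let (19) be applied to the pair of functions

(22) $f(t) = \xi(t, \lambda)$, $g(t) = \Phi(t, \mu)$

(which, in view of the defining properties of the functions (5), (13), is
permissible). Since (5) and (13) are solutions of (1) and (14) respectively,
(22), (20) and an application of Green's identity give

$(\mu - \lambda)\int_0^\infty \xi(t, \lambda)\Phi(t, \mu)\,dt - \int_0^\infty \xi(t, \lambda)\Bigl(\int_0^\mu \Phi(t, \alpha)\,d\alpha\Bigr)dt = 0 - w_{fg}(0)$,

the first term, 0, on the right being $w_{fg}(\infty)$, by (19). In order to evaluate
the second term, $-w_{fg}(0)$, on the right, it is sufficient to observe that, on
the one hand, the function $\Phi(t, \mu)$ of t satisfies the same homogeneous boundary
condition as does the function (10), and that, on the other hand, the
first of the functions (22) satisfies the initial condition (7). Hence it is
clear from (11) that (20) reduces at t = 0 to

$w_{fg}(0) = \{l(\lambda_0) - l(\lambda)\}\phi(\mu)$

in the case (22).

In view of the last two formula lines, the "continuous" Fourier constant,
(17), of $f(t) = \xi(t, \lambda)$ is subject to the identity

$(\mu - \lambda)C(\mu) = \{l(\lambda) - l(\lambda_0)\}\phi(\mu) + \int_0^\mu C(\alpha)\,d\alpha$.

But $\Phi(t, 0) \equiv 0$, by (13), hence C(0) = 0, by (17), and so a partial integration
gives

$(\mu - \lambda)C(\mu) - \int_0^\mu C(\alpha)\,d\alpha = \int_0^\mu (\alpha - \lambda)\,dC(\alpha)$.

Consequently,

$\int_0^\mu (\alpha - \lambda)\,dC(\alpha) = \{l(\lambda) - l(\lambda_0)\}\phi(\mu)$.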




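This relation means that the Stieltjes-Hellinger differentials of the "continuous
Fourier constants," (17), of $f(t) = \xi(t, \lambda)$ are given by

$(\mu - \lambda)\,dC(\mu) = \{l(\lambda) - l(\lambda_0)\}\,d\phi(\mu)$.

A similar calculation shows that the corresponding "discrete" Fourier constants,
(16), of $f(t) = \xi(t, \lambda)$ are given by

$(\lambda_n - \lambda)c(\lambda_n) = \{l(\lambda) - l(\lambda_0)\}/\gamma_n^{1/2}$,

the normalizing factor, $1/\gamma_n^{1/2}$, on the right being supplied by the integral (12).
Accordingly, (15) becomes

$\int_0^\infty |\xi(t, \lambda)|^2\,dt = |l(\lambda) - l(\lambda_0)|^2 \Bigl\{ \sum_{n \geq 0} \frac{1}{\gamma_n |\lambda_n - \lambda|^2} + \int_{-\infty}^{\infty} \frac{d\phi(\mu)}{|\mu - \lambda|^2} \Bigr\}$.

In addition, it is clear from the above evaluations of the terms of Parseval's
relation of $f(t) = \xi(t, \lambda)$ that

$\int_0^\infty \bigl| x(t, \lambda_0) + (\lambda - \lambda_0)\gamma_0 \xi(t, \lambda)/\{l(\lambda) - l(\lambda_0)\} \bigr|^2 dt = \gamma_0^2 |\lambda - \lambda_0|^2 \Bigl\{ \sum_{n \geq 1} \frac{1}{\gamma_n |\lambda_n - \lambda|^2} + \int_{-\infty}^{\infty} \frac{d\phi(\mu)}{|\mu - \lambda|^2} \Bigr\}$.

The last relation supplies the basis for the proof of (ii) and (iii).
First, since the eigenvalues $\lambda_n$ do not cluster at $\lambda_0$, and since $\phi(\lambda)$ is
constant on an open interval containing $\lambda_0$, the expression on the right of
the last equality tends to 0 as $\lambda \to \lambda_0$. Hence, the same is true of the integral
on the left. Consequently, as $\lambda \to \lambda_0$,

(23) $\int_0^T \bigl| x(t, \lambda_0) + \gamma_0 (\lambda - \lambda_0)\xi(t, \lambda)/\{l(\lambda) - l(\lambda_0)\} \bigr|^2 dt \to 0$

holds for every positive T. From this, it will now be deduced that

(24) $l^*(\lambda_0) = -\gamma_0$, where $l^*(\lambda) = dl(\lambda)/d\lambda$

(whilst $' = d/dt$).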


Proof of (24). Two cases will be distinguished, according as there does
or does not exist a $\gamma$ corresponding to which it is possible to choose a sequence
of real or complex numbers $\beta_k$ satisfying

(25) $l(\beta_k) - l(\lambda_0) \to \gamma$ and $\beta_k \to \lambda_0$ as $k \to \infty$.

In both cases, use will be made of the identity

(26) $\xi(t, \lambda) - x(t, \lambda) = \{l(\lambda) - l(\lambda_0)\}x_2(t, \lambda)$

which, since (5) and (10) are solutions of (1), and since a solution x(t)
of (1) is uniquely determined by a pair of initial constants x(0), x'(0),
is clear from (7), $(9_2)$ and (11).
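
The check behind (26) amounts to comparing initial data. As a sketch, assuming the initial conditions in the form $\xi(0,\lambda) = -\sin\theta + l(\lambda)\cos\theta$, $p(0)\xi'(0,\lambda) = \cos\theta + l(\lambda)\sin\theta$ for (7), the same with $l(\lambda_0)$ in place of $l(\lambda)$ for (11), and $x_2(0,\lambda) = \cos\theta$, $p(0)x_2'(0,\lambda) = \sin\theta$ for $(9_2)$:

```latex
\begin{align*}
\xi(0,\lambda) - x(0,\lambda)
  &= \{l(\lambda) - l(\lambda_0)\}\cos\theta
   = \{l(\lambda) - l(\lambda_0)\}\,x_2(0,\lambda),\\
p(0)\bigl(\xi'(0,\lambda) - x'(0,\lambda)\bigr)
  &= \{l(\lambda) - l(\lambda_0)\}\sin\theta
   = \{l(\lambda) - l(\lambda_0)\}\,p(0)\,x_2'(0,\lambda).
\end{align*}
```

Both sides of (26) solve (1) for the same $\lambda$ and carry the same pair of initial values, so they coincide for all t.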

In the first case, that is, if there exist values $y$, $\beta_k$ satisfying (25), the identity (26), when combined with the local continuity of the general solution of (1), shows that, as $k \to \infty$, the limit relation

$\xi(t, \beta_k) \to x(t, \lambda_0) + y\, x_2(t, \lambda_0)$

holds uniformly on every fixed interval $0 \le t \le T < \infty$. In view of (23), this implies that, for every positive $T < \infty$,


T 
lim f | E(t, Bx) |? dt exists (54 
k> & 


It follows therefore from (25) and from the case $\lambda = \beta_k$ of (23), where $\lambda \to \lambda_0$, that either the value of the constant $y$ is 0 or


holds for every $T$. In other words, either $y = 0$ or $x(t, \lambda_0) \equiv 0$. Since $x(t, \lambda_0)$ is an eigenfunction, it follows that $y = 0$.

In the second case, specified before (25), there does not exist any $y$ corresponding to which it is possible to choose a sequence $\beta_1, \beta_2, \ldots$ satisfying (25). Hence, the second case is characterized by $|l(\lambda) - l(\lambda_0)| \to \infty$ as $\lambda \to \lambda_0$, where $\lambda$ varies freely (near $\lambda_0$). It follows therefore from (26) that $x_2(t, \lambda) \to 0$, as $\lambda \to \lambda_0$, holds for every $t$. But this means that $x_2(t, \lambda_0) \equiv 0$ and contains, therefore, a contradiction.

Accordingly, the second case is impossible, and the first case leads to $y = 0$. In view of the definition of the two cases, specified after (25), it now follows that, when $\lambda$ varies freely, $l(\lambda) - l(\lambda_0) \to 0$ as $\lambda \to \lambda_0$. Consequently, from (26),

$\int_0^T |x(t, \lambda) - \xi(t, \lambda)|^2 \, dt \to 0$



as $\lambda \to \lambda_0$. Hence it is clear from (2) that
(A — Ay) (A) 1(Xo) —>—lasaA Xp. 
This proves (24), since yo 0, by (12). 


Completion of the proof of (ii) and (iii). Since the sequence $\beta_1, \beta_2, \ldots$ occurring in (25) was allowed to be complex, the above proof of (24) implies that the derivative of $l(\lambda)$ at the (real) point $\lambda = \lambda_0$ exists even if $\Delta\lambda$ in $[l(\lambda_0 + \Delta\lambda) - l(\lambda_0)]/\Delta\lambda$, where $\Delta\lambda \to 0$, is complex. In other words, the function $l(\lambda)$ of the complex variable $\lambda$ is differentiable at the point $\lambda = \lambda_0$ of the real axis. On the other hand, $l(\lambda)$ is differentiable at every non-real $\lambda$, since it is clear from Weyl's theory ([4], pp. 226-227) that $l(\lambda)$ is regular in the open upper and lower halves of the $\lambda$-plane. But $\lambda_0$ was chosen to be any real number not contained in $S'$. Consequently, $l(\lambda)$ is differentiable at every point of a circle in the complex $\lambda$-plane having as its center the real point $\lambda = \lambda_0$. Hence, $l(\lambda)$ is regular at $\lambda = \lambda_0$.

From now on, $\lambda$ will be assumed to be real and near $\lambda_0$. Then $l(\lambda)$ is real-valued. Furthermore, the function $l(\lambda)$ is decreasing at $\lambda = \lambda_0$, by (24) and (12). Since $l(\lambda)$ is regular at $\lambda = \lambda_0$, it is decreasing near $\lambda = \lambda_0$ also. Finally, $l(\lambda)$ is continuous at and near $\lambda = \lambda_0$. It follows that if 2° and 4
are the constants chosen, with reference to A», at the beginning of the section 


which follows (4), then 


(27) ay = dn + arc tan 
and the initial determination = 2° define, on an open interval about Ap, 
a unique continuous function, %). 

According to (27) and (7),

$\xi(0, \lambda) \sin \alpha_\lambda - p(0)\, \xi'(0, \lambda) \cos \alpha_\lambda = 0.$

This means that the solution (5) of (1) satisfies the case $\alpha = \alpha_\lambda$ of the boundary condition (3a). Furthermore, the solution (5) satisfies (6); so that, since (5) is now real-valued,


(12 bis) $\gamma^2(\lambda) = \int_0^\infty \xi^2(t, \lambda)\, dt$, where $0 < \gamma < \infty$,

defines near $\lambda_0$ a function, $\gamma(\lambda)$, satisfying


(28) = {1 + > 0. 




In fact, (28) follows from (27) and (24), if the $\lambda_0$ of (24), which is any point of S*, is called $\lambda$.

Since $l(\lambda)$ is regular at $\lambda_0$, it is seen from (28), and from the remark made before (4), that the proof of (ii) and (iii) is now complete.


Remark. The last two formula lines show that, if (5) is replaced by 
its constant multiple 


a(t) = {{1 + P(A) ]day/da}é(t, dA), 


then, besides (1) and (3a), (2) will be satisfied by $x(t) = x_\lambda(t)$. It is easily realized that not only the normalizing constant, $x_\lambda(t)/\xi(t, \lambda)$, is a regular function on every $\lambda$-interval $J$; but that the same holds for the $(L^2)$-solution $x_\lambda(t)$ itself, if the latter is thought of as a function of $\lambda$ while $t$ is fixed on the half-line $0 \le t < \infty$ of (1).


Appendix. 


As above, let $S^0$ denote the set of those real numbers $\lambda^0$ corresponding to which the differential equation (1), which is supposed to be of Grenzpunkt type, has a solution satisfying (2) whenever $\lambda$ is in a sufficiently small interval $(\lambda^0 - \epsilon, \lambda^0 + \epsilon)$.

With reference to a fixed choice of the boundary condition (3a), let $S(\alpha)$, $P(\alpha)$ and $Q(\alpha)$ respectively denote the spectrum, the point spectrum and the continuous spectrum (the latter is defined as the complement of the set of the greatest open $\lambda$-intervals on which, in the above notations, Hellinger's continuous function $\phi = \phi_\alpha(\lambda)$ is constant). Thus $S(\alpha) = Q(\alpha) + P(\alpha) + P'(\alpha)$ and $S'(\alpha) = Q(\alpha) + P'(\alpha)$, if $\Lambda'$ denotes the set of the cluster points of a $\lambda$-set $\Lambda$.

In connection with his proof of the fact that $S'(\alpha)$ is independent of $\alpha$, Weyl ([4], pp. 251-252) points out that each of the two terms, $Q(\alpha)$ and $P'(\alpha)$, of $S' = S'(\alpha)$ might be, but has not been proved to be, independent of $\alpha$. While this question will remain unanswered, a partial result can be deduced by the method used above. In fact, the following theorem will be proved:

(I) The open set $S^0$ does not contain any point of any of the continuous spectra $Q(\alpha)$, $-\tfrac{1}{2}\pi \le \alpha < \tfrac{1}{2}\pi$, which are determined by a boundary condition (3a) and a differential equation (1) of Grenzpunkt type.

In view of the parallel standing of $Q(\alpha)$ and $P'(\alpha)$ in $S'$, it is natural to expect that (I) should remain true if $Q(\alpha)$ is replaced by $P'(\alpha)$. But all




that the method leading to (I) will supply in this regard can be formulated as follows:

(II) In the notations of (I), the closure of any of the point spectra $P(\alpha)$, where $-\tfrac{1}{2}\pi \le \alpha < \tfrac{1}{2}\pi$, is nowhere dense on the open set $S^0$; what is more, no subset of any of the intersections $P(\alpha)S^0$ is dense on itself.


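In order to prove (I) and (II), let an $\alpha^0$, where $-\tfrac{1}{2}\pi \le \alpha^0 < \tfrac{1}{2}\pi$, be arbitrarily fixed, and let $x = x(t; \lambda)$, where $-\infty < \lambda < \infty$, denote that solution of (1) which is determined by the initial conditions

$x(0; \lambda) = \cos \alpha^0, \quad p(0)\, x'(0; \lambda) = \sin \alpha^0.$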


Clearly, the solution $x(t; \lambda)$ thus defined differs from the above solution $x(t, \lambda)$, defined by (11), only in a constant factor. In particular, $x = x(t; \lambda)$ satisfies $(3_{\alpha^0})$. Hence, if $\lambda_0, \lambda_1, \ldots$ denotes the (finite or infinite) sequence which forms the point spectrum $P(\alpha^0)$, then the respective eigenfunctions can be chosen to be $x(t; \lambda_0), x(t; \lambda_1), \ldots$. Similarly, the eigendifferentials corresponding to the points $\lambda$ of $Q(\alpha^0)$ are of the form $x(t; \lambda)\, d\phi(\lambda)$, where the monotone function $\phi(\lambda)$ depends on the choice of $\alpha^0$. Obviously, formulae (12)-(17) remain applicable.

Let $\lambda$ be a point of the open set $S^0$, defined above. Then (1) has a real-valued, non-trivial $(L^2)$-solution. Let this solution, which is unique to a constant factor, be denoted by $x = \xi(t; \lambda)$. Since the values of $\xi$ and $\xi'$ at $t = 0$ can be multiplied by any real, non-vanishing constant, they can be normalized by

$\xi(0; \lambda) = \cos \alpha, \quad p(0)\, \xi'(0; \lambda) = \sin \alpha,$

where $\alpha = \alpha(\lambda)$ is unique mod $\pi$. Clearly, the solution $x = \xi(t; \lambda)$ thus defined differs from the above solution $x = \xi(t, \lambda)$, defined by (7), only in a constant factor.

Let Parseval's relation, (15), be applied to $f(t) = \xi(t; \lambda)$ (instead of to $f(t) = \xi(t, \lambda)$, as above). To this end, it is sufficient to observe that, in view of the last two formula lines, the value of the Wronskian (20) at $t = 0$ now reduces to

$W_{fg}(0) = \sin(\alpha - \alpha^0)$, with $\alpha = \alpha(\lambda)$,

if (22) is replaced by $f(t) = \xi(t; \lambda)$, $g(t) = x(t; \mu)$; cf. (13), where $x(t, \lambda)$ must be replaced by $x(t; \lambda)$. Correspondingly, it is seen from the above calculation of the Fourier constants, (16) and (17), of $f(t) = \xi(t, \lambda)$ that the Fourier constants of $f(t) = \xi(t; \lambda)$ are given by




(An —A)e(An) = (w—A)dC(u) = {sin — a) 


Consequently, the explicit form of the case $f(t) = \xi(t; \lambda)$ of (15) can be written as


— a) A(A) + 

0 
if $\sin(\alpha^0 - \alpha) \ne 0$, where $\lambda$ is any point of $S^0$, and $A(\lambda)$, $B(\lambda)$ are defined by


A(A) = = 3 — 2) 


It follows that $A(\lambda) < \infty$ and $B(\lambda) < \infty$, provided that the point $\lambda$ of $S^0$ is such that the corresponding $\alpha = \alpha(\lambda)$, defined above, satisfies the proviso $\sin(\alpha^0 - \alpha) \ne 0$. Clearly, this proviso, $\alpha(\lambda) \not\equiv \alpha^0 \pmod{\pi}$, is satisfied if and only if $\lambda$ is not in $P(\alpha^0)$. Accordingly, $A(\lambda) < \infty$ and $B(\lambda) < \infty$ hold for all those points $\lambda$ of $S^0$ which are not in $P(\alpha^0)$. From this result, it will now be easy to deduce both (I) and (II).


Proof of (I). Let $S^0 = \sum J^i$ be the decomposition of the open set $S^0$ into mutually disjoint, open intervals $J^i$ (one of which can be a half-line), and let $J = J(\lambda)$ denote that $J^i$ which contains a given point, $\lambda$, of $S^0$. The assertion of (I) is equivalent to the statement that the monotone function $\phi(\mu)$, occurring in the above definition of $A(\lambda)$, is constant on the $\mu$-interval $J$. In fact, while it is true that the function $\phi(\mu)$ depends on the choice of the $\alpha = \alpha^0$ which determines the continuous spectrum $Q(\alpha) = Q(\alpha^0)$, the angle $\alpha^0$ has above been chosen arbitrarily.

Suppose that the point $\lambda$ of $J$ is not in $P(\alpha^0)$. Then $A(\lambda) < \infty$. Hence, by the definition of $A(\lambda)$, and since $\phi(\mu)$ is monotone,


r 


where $\epsilon > 0$ is arbitrary. Consequently, if $\lambda$ is not in $P(\alpha^0)$, the derivative $d\phi(\lambda)/d\lambda$ exists and is 0. Since $\phi(\mu)$ is monotone and continuous, and since the point spectrum $P(\alpha^0)$ is enumerable (at most), it follows, first, that $\phi(\lambda)$ is absolutely continuous and then, that $\phi(\lambda)$ is constant on $J$.


Proof of (II). If $\lambda$ is in $S^0$ but not in $P(\alpha^0)$, then $B(\lambda) < \infty$. But $B(\lambda)$ was defined as a Borel series of the form $B(\lambda) = \sum c_n/(\lambda - \lambda_n)^2$, where




the coefficients $c_n = 1/\gamma_n^2$ are positive. Since the values $\lambda_n$ occurring in the denominators form the sequence which constitutes the point spectrum $P(\alpha^0)$, it follows that the series $\sum c_n/(\lambda - \lambda_n)^2$ is convergent at all those points of the open set $S^0$ which are distinct from every $\lambda_n$. Hence, (II) follows from classical properties of the $\lambda$-set on which a Borel series must diverge; cf., e.g., [1], pp. 314-315.


Remark. Let $M(\alpha^0)$ denote the sequence, say $\mu_1 = \mu_1(\alpha^0), \mu_2 = \mu_2(\alpha^0), \ldots$, of those eigenvalues $\lambda_n = \lambda_n(\alpha^0)$ which are contained in an open interval, $J$. If it could be proved that the corresponding Borel series, say $\sum b_n/(\lambda - \mu_n)^2$, where $b_n > 0$, cannot converge at every point of $J - M$ unless the set of the cluster points of $M$ is enumerable (at most), it would follow that the set which (II) claims to be nowhere dense is enumerable as well. In view of (I), this would prove that the intersection of $S^0$ and $S' = Q(\alpha) + P'(\alpha)$, besides being nowhere dense, is enumerable (at most). It would still not follow that $S^0$ is entirely free of points of $S'$.


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] H. Hahn, Theorie der reellen Funktionen, vol. I, Berlin, 1921.

[2] P. Hartman and A. Wintner, “An oscillation theorem for continuous spectra,” Proceedings of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.

[3] E. Hellinger, “Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen,” Journal für Mathematik, vol. 136 (1909), pp. 210-271.

[4] H. Weyl, “Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen,” Mathematische Annalen, vol. 68 (1910), pp. 222-269.


ADDITIVE GROUPS AND LINEAR MANIFOLDS OF TRANSFORMATIONS BETWEEN BANACH SPACES.* 1


By BERTRAM YOOD.


1. Introduction. In this paper we consider the space $A(X, Y)$ of all bounded linear transformations with domain $X$ and range in $Y$, where $X$ and $Y$ are Banach spaces with complex scalars. We study $A(X, Y)$ in the uniform, strong, weak, finite and weakly finite topologies. It is our main purpose to obtain formulas for the closure of certain classes of additive groups and linear manifolds in $A(X, Y)$ in these topologies (except the uniform). The first step in this direction is the demonstration in §2 that the closure of a set $B$ in $A(X, Y)$ is contained in a set related to $B$ (which may vary with the topologies). Then it is shown that, under appropriate conditions, the closure of additive groups and linear manifolds attains this maximum possible size. To a large extent the arguments are algebraic. These are based on the theory of rings as developed by Jacobson in [5, 6, 7]. If the notion of ideal is defined in a natural way (5.1), the closure in the strong and weak topologies of each of these groups and linear manifolds is the intersection of the smallest weakly closed left and right ideals containing the given set.

In the use of Jacobson's ideas in this connection the following difficulty arises. Let $B$ be a one-fold transitive ring in $A(X, X)$. (We are using the terminology of Jacobson [5].) By the results of Jacobson [5], $B$ is dense, in the sense that for any two sets $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ of vectors, where the members of the first set are linearly independent (with respect to the complex scalars), there exists a $T \in B$ such that $T(x_i) = y_i$, $i = 1, \ldots, n$, if and only if the centralizer of $B$ in the set of all additive transformations defined on $X$ is the set of all complex scalar multiples of the identity transformation. We have been unable to verify or refute the statement that a one-fold transitive ring $B$ here has both these properties. But we do show that this is true if $B$ is an algebra closed in the uniform topology. This is sufficient to show that any one-fold transitive algebra in $A(X, X)$ is dense in the strong and weak topologies of $A(X, X)$.


* Received July 9, 1948.
1 Presented to the American Mathematical Society, February 22, 1947, under the title On ideals in operator rings over Banach spaces. This is a revised version of a portion of the author's dissertation, Yale University, June 1947.




In §5 we consider ideals in $A(X, Y)$ and certain quotient (difference) spaces which can be defined in terms of subgroups. In §6 we discuss the notion of centralizer.

The author wishes to express his gratitude to Professors Nelson Dunford and C. E. Rickart for their suggestions and encouragement in the writing of this paper.


2. Preliminaries. Let $X$ and $Y$ be two Banach spaces and let $A(X, Y)$ be the class of all bounded linear transformations with domain $X$ and range in $Y$. Under the usual definitions of addition and multiplication by complex scalars, $A(X, Y)$ is a linear space. When $X = Y$ we have the additional operation of multiplication, $V = TU$ defined by the relation $V(x) = T[U(x)]$.

Following the ideas of von Neumann [10] we consider the uniform, strong and weak topologies for $A(X, Y)$. The uniform topology is the Banach space topology obtained by defining a norm for transformations, $\|T\| = \sup \|T(x)\|$, $\|x\| = 1$. A strong neighborhood of $T_0$ is any set of the form $N[T_0; x_1, \ldots, x_n; \epsilon]$ which is composed of all transformations $T$ in $A(X, Y)$ such that $\|T(x_k) - T_0(x_k)\| < \epsilon$, $k = 1, \ldots, n$, where $x_1, \ldots, x_n$ are elements of $X$ and $\epsilon > 0$. A weak neighborhood is any set $N[T_0; x_1, \ldots, x_n; y_1^*, \ldots, y_n^*; \epsilon]$ composed of transformations $T \in A(X, Y)$ satisfying the relations $|y_k^*[T(x_k)] - y_k^*[T_0(x_k)]| < \epsilon$, $k = 1, \ldots, n$, where the $x$'s and $\epsilon$ are described above and $y_1^*, \ldots, y_n^*$ are $n$ arbitrary elements of $Y^*$, the conjugate space of $Y$. The finite topology is described by Jacobson [%, p. 11]. A finite neighborhood of $T_0$ is any set $N[T_0; x_1, \ldots, x_n]$ which consists of all $T \in A(X, Y)$ such that $T(x_k) = T_0(x_k)$, $k = 1, \ldots, n$. A weakly finite neighborhood of $T_0$ is any set $N[T_0; x_1, \ldots, x_n; y_1^*, \ldots, y_n^*]$ of all $T \in A(X, Y)$ such that $y_k^*[T(x_k)] = y_k^*[T_0(x_k)]$, $k = 1, \ldots, n$. All neighborhoods are obtained by varying $T_0$, $\epsilon$, $x_k$, $y_k^*$, $n$ over their domains of definition.
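As a finite-dimensional illustration only (the paper works in general Banach spaces), the strong and finite neighborhoods just defined can be checked for transformations given by small matrices. The helper names `apply_map`, `in_strong_nbhd` and `in_finite_nbhd`, and the choice of the Euclidean norm on the plane, are assumptions made for this sketch and do not come from the paper.

```python
from math import sqrt

def apply_map(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * c for t, c in zip(row, x)) for row in T]

def norm(v):
    """Euclidean norm, standing in for the Banach space norm."""
    return sqrt(sum(c * c for c in v))

def in_strong_nbhd(T, T0, xs, eps):
    """True if ||T(x_k) - T0(x_k)|| < eps for every x_k in xs."""
    return all(
        norm([a - b for a, b in zip(apply_map(T, x), apply_map(T0, x))]) < eps
        for x in xs
    )

def in_finite_nbhd(T, T0, xs):
    """True if T(x_k) = T0(x_k) exactly for every x_k in xs."""
    return all(apply_map(T, x) == apply_map(T0, x) for x in xs)

T0 = [[1.0, 0.0], [0.0, 1.0]]   # identity on the plane
T = [[1.0, 0.0], [0.0, 5.0]]    # agrees with T0 on e1 only
e1, e2 = [1.0, 0.0], [0.0, 1.0]
```

Here `T` lies in the finite neighborhood of `T0` determined by `e1` alone, although it is far from `T0` in the uniform norm; adding `e2` to the list of test vectors excludes it from both kinds of neighborhood.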

Let $B$ be any subset of $A(X, Y)$. Connected with $B$ are two sets $M(B) \subset X$ and $N(B) \subset Y$ defined by

$M(B) = \{x \in X \mid T(x) = 0 \text{ for every } T \in B\}$
$N(B) = \{y \in Y \mid y = T(x) \text{ for some } T \in B,\ x \in X\}.$

We use the notation $\{x \mid P\}$ to mean the set of all $x$ with the property $P$. $M(B)$ is necessarily a closed linear manifold of $X$. $N(B)$ has the property that if $y \in N(B)$ and $\lambda$ is a scalar, then $\lambda y \in N(B)$. However examples can be readily supplied to show that $N(B)$ need not be a linear manifold, even




if $B$ is a linear manifold in $A(X, Y)$. Consider, for instance, the linear manifold generated by two 1-1 transformations whose ranges have only the zero element of $Y$ in common. We let $N_1(B)$ designate the linear manifold in $Y$ generated by $N(B)$.
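The example just mentioned can be made concrete in finite dimensions. The sketch below is an illustration under assumptions not in the paper: it takes $X$ to be the plane, $Y$ to be four-dimensional space, and the two 1-1 transformations $T_1(x) = (x, 0)$ and $T_2(x) = (0, x)$. Every element of the manifold they generate has the form $aT_1 + bT_2$, so a vector $(u, v)$ lies in the range set exactly when $u$ and $v$ are linearly dependent. The function names are hypothetical.

```python
def element_of_range(a, b, x):
    """Value (a*T1 + b*T2)(x) = (a*x, b*x) for x in the plane."""
    return [a * x[0], a * x[1], b * x[0], b * x[1]]

def in_range_set(y):
    """y = (u1, u2, v1, v2) is of the form (a*x, b*x) for some scalars a, b
    and some x exactly when u and v are linearly dependent (zero 2x2 determinant)."""
    u1, u2, v1, v2 = y
    return u1 * v2 - u2 * v1 == 0

def add(p, q):
    return [s + t for s, t in zip(p, q)]

def scale(c, p):
    return [c * s for s in p]

y1 = element_of_range(1, 0, [1, 0])   # T1(e1) = (1, 0, 0, 0)
y2 = element_of_range(0, 1, [0, 1])   # T2(e2) = (0, 0, 0, 1)
```

Both `y1` and `y2` and all their scalar multiples lie in the range set, but their sum `(1, 0, 0, 1)` does not, so the range set is closed under scalar multiplication without being a linear manifold. Integer entries are used so that the determinant test is exact.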


2.1. THEOREM. The closure of a set $B \subset A(X, Y)$ in the finite, strong and weak topologies is contained in the following sets, respectively,

(1) $\{T \in A(X, Y) \mid M(T) \supset M(B),\ N(T) \subset N(B)\}$
(2) $\{T \in A(X, Y) \mid M(T) \supset M(B),\ N(T) \subset \overline{N(B)}\}$
(3) $\{T \in A(X, Y) \mid M(T) \supset M(B),\ N(T) \subset \overline{N_1(B)}\}$.

The closure of $B$ in the weakly finite topology is contained in (3).


That the closure of $B$ in the finite topology is contained in (1) follows directly from the definitions. Of all the topologies considered here, the weak topology is the weakest in the sense that the closure of a set in it is at least as large as is the closure in any of the other topologies. Suppose that $T_0(x) = y \ne 0$ for some $x \in M(B)$. Let $y^* \in Y^*$ have the property that $y^*(y) = 1$. Then no $T \in B$ is in the weak neighborhood $N[T_0; x; y^*; \tfrac{1}{2}]$ of $T_0$. Thus for $T$ in the closure of $B$ in any of the topologies, $M(T) \supset M(B)$.

If $T_0(x) = y$, $y \notin \overline{N(B)}$ for some $x \in X$, then no $T \in B$ is in the strong neighborhood $N[T_0; x; \rho]$ of $T_0$ where $\rho$ is the distance from $y$ to $N(B)$. This proves the statement on the strong topology. Suppose that for some $x \in X$, $T_0(x) = y_0$, $y_0 \notin \overline{N_1(B)}$. Then there exists a linear functional $y^* \in Y^*$ such that $y^*(y_0) = 1$ while $y^*(y) = 0$ for $y \in N_1(B)$. No $T \in B$ is in the weak neighborhood $N[T_0; x; y^*; \tfrac{1}{2}]$ of $T_0$. This proves the statements for the weak and weakly finite topologies.

Next let $G$ be an additive group of transformations in $A(X, Y)$. Connected with $G$ are a number of rings of transformations. Let

$R_1(G) = \{T \in A(Y, Y) \mid TU \in G \text{ for every } U \in G\}$
$R_2(G) = \{T \in A(X, X) \mid UT \in G \text{ for every } U \in G\}.$

It is readily verified that the group property of $G$ implies that $R_1(G)$ and $R_2(G)$ are rings of transformations in $A(Y, Y)$ and $A(X, X)$ respectively.

Every $T \in R_1(G)$ has the property that $T(y) \in N(G)$ if $y \in N(G)$. For if otherwise $T(y_1) = y_2$, $y_1 \in N(G)$, $y_2 \notin N(G)$, then by the definition of $N(G)$ there exist $U \in G$ and $x \in X$ such that $U(x) = y_1$, $TU(x) = y_2$, so that $TU \notin G$, contrary to the definition of $R_1(G)$. Thus the image of $N_1(G)$ is contained in $N_1(G)$ for every $T \in R_1(G)$. Let $T'$ be the contraction of $T$




to the domain $N_1(G)$. Then the mapping $T \to T'$ defined for $T \in R_1(G)$ is a ring homomorphism under which the image of $R_1(G)$ is again a ring which we shall call $R_l(G)$. $R_l(G)$ is a ring of transformations from $N_1(G)$ to $N_1(G)$.

By precisely the same reasoning, every $T \in R_2(G)$ takes $M(G)$ into $M(G)$. We consider the difference (quotient) space of $X$ modulo $M(G)$, $X/M(G)$. It is well known, see [2, p. 232], that if we define the norm $\|W\|$ of a coset $W$ as $\|W\| = \inf \|x\|$, $x \in W$, then $X/M(G)$ is a Banach space. We use the notation $x'$ to designate the coset of $X/M(G)$ which contains $x$. Since each $T \in R_2(G)$ vanishes on $M(G)$, each $T \in R_2(G)$ defines a transformation $T'$ from $X$ to $X/M(G)$ by the rule $T'(x) = [T(x)]'$. Since $\|T'(x)\| \le \|T(x)\|$, $\|T'\| \le \|T\|$ and $T'$ is bounded. Also as $T'(x) = 0'$ for $x \in M(G)$, we can define $T''(x') = T'(x)$ as a transformation from $X/M(G)$ to itself. For each $x \in x'$, $\|T''(x')\| \le \|T'\| \, \|x\|$. Clearly $\|T''\| = \|T'\|$ so that $T''$ is bounded. The correspondence $T \to T''$ has the properties $(T + U)'' = T'' + U''$, $(TU)'' = T''U''$. The latter property holds since $T''U''(x') = T''[U(x)]' = [TU(x)]' = (TU)''(x')$. Thus the correspondence $T \to T''$ takes the ring $R_2(G)$ into a ring $R_r(G)$ of bounded linear transformations from $X/M(G)$ to itself. $R_r(G)$ is defined unambiguously by $G$.

As a bridge between the closure in the finite topology and that in the strong and weak topologies we shall use the following result.

2.2. THEOREM. Let $B \subset A(X, Y)$ have the property that its closure in the finite topology is the set (1) of Theorem 2.1. Then its closure in the strong topology is the set (2) of Theorem 2.1. If $N(B)$ is a linear manifold, this is also its closure in the weak topology.


Let $T_0$ be in the set (2) of Theorem 2.1 and let $N[T_0; x_1, \ldots, x_n; \epsilon]$ be a strong neighborhood of $T_0$. By renumbering, if necessary, we may suppose that $x_1, \ldots, x_r$ is a linearly independent subset of $x_1, \ldots, x_n$ which generates the same linear manifold. Then we can write


vi => 
k=1 


Since N(T,) C N(B) we can select vectors yi R(B) such that || — ) 
< min for i=1,---,r where for r+1SiSn 
and $1 \le k \le r$. Now the closure, $\bar{B}$, of $B$ in the strong topology contains its closure in the finite topology. Then by hypothesis we can select $T \in \bar{B}$ such that $T(x_i) = y_i$, $i = 1, \ldots, r$. For these values of $i$, $\|T(x_i) - T_0(x_i)\| < \epsilon$,




and a simple computation shows that this is also true for $i = r + 1, \ldots, n$. Consequently $T_0 \in \bar{\bar{B}} = \bar{B}$. Then by Theorem 2.1, $\bar{B}$ is the set (2) of Theorem 2.1. Again by Theorem 2.1, if $N(B)$ is a linear manifold, the set (2) is the closure in the weak topology.


We conclude this section with a necessary condition on strong closure.

2.3. LEMMA. Let $x_1, \ldots, x_n$ be $n$ linearly independent vectors in a Banach space $X$. Then there exists a number $\epsilon > 0$ such that if $\|y_i - x_i\| < \epsilon$, $y_i \in X$, $i = 1, \ldots, n$, the vectors $y_i$ are linearly independent.


Let $I$ be the identity transformation from $X$ to itself. Let $x_i^*$ be linear functionals defined on $X$ such that $x_i^*(x_j) = \delta_{ij}$, $i, j = 1, \ldots, n$. We form the transformation

$T(x) = I(x) + \sum_{i=1}^{n} x_i^*(x)\,(y_i - x_i).$

It is readily verified that $T$ is a bounded linear transformation and that

$\|I - T\| \le \sum_{i=1}^{n} \|x_i^*\| \, \|y_i - x_i\|.$

Thus if we select $\epsilon = 1/(nA)$ where $A = \max \|x_i^*\|$, $\|I - T\| < 1$. From this it follows (see, for example, Gelfand [4, p. 4]) that $T$ is an isomorphism. But $T(x_i) = y_i$, and thus the $y_i$'s are linearly independent.
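The construction in this proof can be tried numerically in three-dimensional space. The sketch below is an illustration under assumptions not in the paper: the $x_i$ are the first two standard basis vectors, the $x_i^*$ are the corresponding coordinate functionals (so $A = 1$ in the Euclidean norm and the bound $1/(nA)$ equals $1/2$), and the helper names `build_T`, `apply_map` and `det3` are hypothetical.

```python
from math import sqrt

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def apply_map(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * c for t, c in zip(row, x)) for row in T]

def build_T(ys):
    """Matrix of T(x) = x + sum_i x_i*(x) (y_i - x_i), where x_i = e_i and
    x_i* is the i-th coordinate functional. Column j of T is y_j for
    j < len(ys) and the basis vector e_j otherwise."""
    n = 3
    cols = [
        list(ys[j]) if j < len(ys) else [1.0 if r == j else 0.0 for r in range(n)]
        for j in range(n)
    ]
    return [[cols[j][r] for j in range(n)] for r in range(n)]

def dist(p, q):
    return sqrt(sum((s - t) ** 2 for s, t in zip(p, q)))

e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
ys = [[1.0, 0.3, -0.2], [0.25, 1.0, 0.3]]   # each within 1/2 of e1, e2
T = build_T(ys)
```

With these values `T` sends `e1` and `e2` to the perturbed vectors and has determinant 0.925, so it is invertible and the perturbed vectors are linearly independent, as the lemma asserts.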


2.4. THEOREM. Let $B$ be any set in $A(X, Y)$. In order that $T_0$ be in the closure of $B$ in the strong topology it is necessary that, whenever $x_1, \ldots, x_n$ in $X$ and $T_0(x_1), \ldots, T_0(x_n)$ in $Y$ are each linearly independent sets of vectors, there exists a transformation $T \in B$ such that the vectors $T(x_1), \ldots, T(x_n)$ are linearly independent.


Consider the $\epsilon$ of Lemma 2.3 for the vectors $T_0(x_1), \ldots, T_0(x_n)$ of $Y$. Every $T$ in the strong neighborhood $N[T_0; x_1, \ldots, x_n; \epsilon]$ of $T_0$ has the property that $T(x_1), \ldots, T(x_n)$ are linearly independent.


3. On some closure formulas. We begin with the finite topology. For the notions of irreducibility and density see Jacobson [5].

3.1. THEOREM. Let $G$ be an additive group in $A(X, Y)$. If $R_l(G)$ is a two-fold transitive ring of transformations from $N(G)$ to $N(G)$ then the closure of $G$ in the finite topology is the set of all $T$ in $A(X, Y)$ such that $M(T) \supset M(G)$ and $N(T) \subset N(G)$.


By Theorem 2.1, the closure of $G$ in the finite topology is contained in




the set of this theorem. To show that every transformation of the set is in the closure in the finite topology it is sufficient to show that if $x_1, \ldots, x_n$ are any vectors in $X$, linearly independent modulo $M(G)$, and if $y_1, \ldots, y_n$ are any $n$ vectors in $N(G)$, then there exists a transformation $T \in G$ such that $T(x_i) = y_i$, $i = 1, \ldots, n$.

We proceed by induction. Let $x \notin M(G)$, $y \in N(G)$. Then there exists $T \in G$ such that $T(x) \ne 0$. As $y$ and $T(x)$ are in $N(G)$, by hypothesis there exists $U \in R_l(G)$ such that $U[T(x)] = y$. Since $UT \in G$, the proof for $n = 1$ is complete. For $n = 2$ we consider $x_1$ and $x_2$ in $X$ any two vectors linearly independent modulo $M(G)$ and $y_1$ and $y_2$ in $N(G)$. By the case $n = 1$ and the fact that $G$ is an additive group there will exist $T \in G$ such that $T(x_i) = y_i$, $i = 1, 2$, if we can always find a $T \in G$ such that $T(x_1) = 0$ while $T(x_2) \ne 0$. We suppose that no such $T$ exists in $G$. By the two-fold transitive nature of $R_l(G)$, this implies that for every $T \in G$, $T(x_1)$ and $T(x_2)$ are linearly dependent. As shown by Jacobson [5, p. 232] the mapping $y = T(x_1) \to T(x_2) = V(y)$ is single-valued and defined everywhere in $N(G)$. Furthermore, since $G$ is an additive group and one-fold transitive, the domain $N(G)$ of this mapping is closed under addition and $V$ is additive on $N(G)$. Moreover $N(G)$ has the property that if $y \in N(G)$, then $\lambda y \in N(G)$ for each scalar $\lambda$. Thus $N(G)$ is a linear manifold. We show that $V$ is homogeneous on $N(G)$. For every $y \in N(G)$, there exist scalars $\alpha$ and $\beta$ not both zero such that $\alpha y + \beta V(y) = 0$. If $\beta \ne 0$, then $V(y)$ is a multiple of $y$ while if $\beta = 0$, then as $\alpha \ne 0$ we have $y = 0$ and again $V(y)$ is a multiple of $y$. Given $y = T(x_1)$ and a scalar $a \ne 0$, we have a transformation $U \in R_l(G)$ such that $UT(x_1) = ay$. If $T(x_2) = 0$, then $V(y) = V(ay) = aV(y) = 0$. If $T(x_2) \ne 0$, then also $T(x_1) \ne 0$ and $T(x_2) = \rho T(x_1)$, $\rho \ne 0$. Since $UT(x_1) = aT(x_1)$, it follows that $UT(x_2) = aT(x_2)$. Thus $V(ay) = UT(x_2) = aV(y)$. To see that $V(y)$ is always the same multiple of $y$ we need now consider only linearly independent $y$'s. If $V(y_1) = a_1 y_1$, $V(y_2) = a_2 y_2$ where $y_1$ and $y_2$ are linearly independent in $N(G)$, then $V(y_1 + y_2) = a_3(y_1 + y_2) = a_1 y_1 + a_2 y_2$ and $a_1 = a_2 = a_3$. Thus $V(y) = \alpha y$ for some scalar $\alpha$ and therefore $T(x_2 - \alpha x_1) = 0$ for every $T \in G$. This contradicts the fact that $x_1$ and $x_2$ were chosen linearly independent modulo $M(G)$.

The induction argument for $n > 2$ proceeds along the lines of the analogous portion of the induction used by Jacobson [5, Theorem 6]. We suppose that the theorem is true for $n - 1$. As above, it is sufficient to show that for $x_1, \ldots, x_n$ linearly independent modulo $M(G)$ there exists a transformation $T \in G$ such that $T(x_1) = \cdots = T(x_{n-1}) = 0$, $T(x_n) \ne 0$. Consider the subgroup $G'$ of $G$ of all $T \in G$ which vanish on $x_1, \ldots, x_{n-2}$.




$M(G')$ contains $M(G)$ and $x_1, \ldots, x_{n-2}$ but, by induction hypothesis, no non-trivial linear combination of $x_{n-1}$ and $x_n$. Thus $x_{n-1}$ and $x_n$ are linearly independent modulo $M(G')$. Then since $R_l(G') \supset R_l(G)$, we may use the first part of the proof to assert the existence of a transformation $T \in G'$ such that $T(x_{n-1}) = 0$, $T(x_n) \ne 0$. This completes the induction.

In the course of the proof we showed the following. 


3.2. COROLLARY. Under the hypotheses of Theorem 3.1, $N(G)$ is a linear manifold.


3.3. COROLLARY. Under the hypotheses of Theorem 3.1 the closure of $G$ in the strong and weak topologies is the set of all $T \in A(X, Y)$ such that $M(T) \supset M(G)$ and $N(T) \subset \overline{N(G)}$.


This follows immediately from Theorem 3.1, Corollary 3.2 and Theorem 2.2.


3.4. THEOREM. Let $G$ be an additive group in $A(X, Y)$. If $R_r(G)$ is a two-fold transitive ring of transformations from $X/M(G)$ to itself then the closure of $G$ in the finite topology is the set of all $T \in A(X, Y)$ such that $M(T) \supset M(G)$ and $N(T) \subset N(G)$.


As in Theorem 3.1 it is sufficient to show that if $x_1, \ldots, x_n$ are any $n$ vectors of $X$ linearly independent modulo $M(G)$ and if $y_1, \ldots, y_n$ are in $N(G)$, then there exists a transformation $T \in G$ such that $T(x_i) = y_i$, $i = 1, \ldots, n$. By the definition of $N(G)$ there exist vectors $z_i$ in $X$ and transformations $U_i \in G$ such that $U_i(z_i) = y_i$, $i = 1, \ldots, n$. As shown by Jacobson [5, Theorem 6], $R_r(G)$ is a dense ring of transformations from $X/M(G)$ to itself. Using the notation of section 2, where $R_r(G)$ is defined, we see that there exist transformations $V_i$ in $R_r(G)$ such that $V_i(x_i') = z_i'$ and $V_i(x_j') = 0'$ for $j \ne i$, $i, j = 1, \ldots, n$. Let $T_i$ be any transformation in $R_2(G)$ such that $T_i'' = V_i$, $i = 1, \ldots, n$. Then $\sum_{i=1}^{n} U_i T_i \in G$ and, as each $U_i$ vanishes on $M(G)$,

$\sum_{i=1}^{n} U_i T_i(x_j) = U_j T_j(x_j) = U_j(z_j) = y_j.$

This completes the proof.


We note that under the hypothesis of the theorem, $N(G)$ is necessarily a linear manifold. For let $\sum_{i=1}^{n} \lambda_i y_i$ be a linear combination of vectors in $N(G)$. Then using the notation of the proof,



$\sum_{i=1}^{n} U_i T_i \Bigl( \sum_{j=1}^{n} \lambda_j x_j \Bigr) = \sum_{i=1}^{n} \lambda_i y_i.$

Thus, as in the case of Corollary 3.3, we have the following result.


3.5. COROLLARY. Under the hypotheses of Theorem 3.4, the closure of $G$ in the strong and weak topologies is the set of all $T \in A(X, Y)$ such that $M(T) \supset M(G)$ and $N(T) \subset \overline{N(G)}$.

We can express the results of Corollaries 3.3 and 3.5 in another way 
by introducing the concept of annihilator. 


3.6. DEFINITION. Let $B \subset A(X, Y)$. By the left (right) annihilator of $B$ we mean the set $B^l$ ($B^r$) of all $U \in A(Y, X)$ such that $UT = 0$ ($TU = 0$) for every $T \in B$.


The following is readily verified. 


3.7. LEMMA. $B^l$ is the set of all $U \in A(Y, X)$ such that $U(y) = 0$ for every $y \in N(B)$. $B^r$ is the set of all $U \in A(Y, X)$ such that $N(U) \subset M(B)$.


If we set $B^{lr} = (B^l)^r$, $B^{rl} = (B^r)^l$ then, by Lemma 3.7,
$B^{lr} = \{T \in A(X, Y) \mid N(T) \subset \overline{N(B)}\}$, $B^{rl} = \{T \in A(X, Y) \mid M(T) \supset M(B)\}$.
Thus we have the following result.


3.8. THEOREM. Under the conditions of Corollary 3.3 or Corollary 3.5, the closure of the additive group $G$ in either the strong or weak topology is the set $G^{lr} \cap G^{rl}$.


We show next that if $Y$ is an infinite dimensional Banach space there is always an additive group $G$ in $A(X, Y)$ satisfying the conditions of Corollaries 3.3 and 3.5 such that the formula of Theorem 3.8 properly contains the closure of $G$ in the finite topology. This follows from the fact that $Y$ must always contain a submanifold which is not closed. This may be seen as follows. There exists a sequence $\{y_n\}$ of vectors in $Y$ such that no finite subset is linearly dependent. Let $N$ be the linear manifold in $Y$ generated by $\{y_n\}$. If $N$ is closed then $\{y_n\}$ is a Hamel basis for the Banach space $N$. But Mackey [9, p. 159] has shown that the power of the Hamel basis for an infinite-dimensional Banach space is at least that of the continuum. Hence $N$ is not closed. The additive group $G = \{T \in A(X, Y) \mid N(T) \subset N\}$ is closed in the finite topology by Theorem 3.1, say, but has the larger closure $\{T \in A(X, Y) \mid N(T) \subset \overline{N}\}$ in the strong and weak topologies.




4. On linear manifolds closed in the uniform topology. We now relate the property of a linear manifold being closed in the uniform topology of $A(X, Y)$ to its having the maximum possible closure in the strong and weak topologies. We begin with a study of irreducible algebras in $A(X, X)$. By an algebra is meant here a set which is both a linear manifold and a ring. The term dense is defined in §1.


4.1. THEOREM. An irreducible algebra $\mathfrak{A}$ in $A(X, X)$, closed in the uniform topology, is dense in $A(X, X)$.


Let us suppose that $M$ is not two-fold transitive. This gives us (Jacobson
[5, p. 232]) two linearly independent elements of $X$, $x_1$ and $x_2$, such that for
no $T \in M$ do we have $T(x_1) = 0$ and $T(x_2) \neq 0$. As shown by Jacobson this
defines an additive mapping $V$: $T(x_1) \to T(x_2)$ of $X$ into itself which
commutes with every $T \in M$. Since $M$ is a linear manifold, $V(\lambda x) = \lambda V(x)$ for
each scalar $\lambda$. Now since $VT = TV$ for every $T \in M$ it is impossible that
$V(x) = 0$ for $x \neq 0$, for otherwise we can choose $T$ such that $T(x) = y$
where $V(y) \neq 0$, arriving at a contradiction. Next we consider the mappings
$W_i$ of $M$ into $X$ defined by

\[ W_i(T) = T(x_i), \qquad i = 1, 2. \]

Each $W_i$ is a homogeneous and additive transformation defined on the Banach
space $M$ (in the uniform topology) and is bounded since
$\|W_i(T)\| = \|T(x_i)\| \leq \|T\|\,\|x_i\|$.
By the irreducibility of $M$ each of these transformations has $X$ as its range.
Since $V$ is 1-1, $W_1(T) = 0$ if and only if $W_2(T) = 0$. We let $M_0 \subset M$
be the closed linear manifold for which $W_1(T) = W_2(T) = 0$. The quotient
space $M/M_0$ is a Banach space with the norm of each coset $L$ given by
$\|L\| = \inf \|T\|$, $T \in L$. We define transformations $W_i'(L)$ on $M/M_0$ to $X$
by the rule

\[ W_i'(L) = W_i(T), \qquad T \in L, \quad i = 1, 2. \]

Each $W_i'$ is linear and bounded for

\[ \|W_i'(L)\| = \|W_i(T)\| \leq \|W_i\|\,\|T\| \]

for each $T \in L$, hence $\|W_i'(L)\| \leq \|W_i\|\,\|L\|$. In fact it may be shown that
$\|W_i'\| = \|W_i\|$. By a well-known theorem of Banach [2, p. 41], $W_i'$ is
an isomorphism between $M/M_0$ and $X$ and $W_i'^{-1}$ is bounded. From the relation
$VW_1' = W_2'$ it may be seen that $W_2'W_1'^{-1}(x) = V(x)$ for all $x \in X$.
Consequently $V$ is continuous. Since $M$ is irreducible, by a lemma of Bochner and
Phillips [3, p. 412] it follows that $V$ is a multiple of the identity
transformation. Thus there is a scalar $\lambda$ such that $V(x) = \lambda x$, whence
$T(x_2 - \lambda x_1) = 0$ for every $T \in M$. Again by the irreducibility of $M$,
$x_2 - \lambda x_1 = 0$, which contradicts the fact that $x_1$ and $x_2$ were chosen to be
linearly independent. Thus $M$ is two-fold transitive and by the results of
Jacobson [5, p. 233], $M$ is dense.
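
The continuity step in the proof above can be laid out as a short chain. This is a sketch only: the label $M_0$ for the common null manifold of $W_1$ and $W_2$ is chosen here for illustration and need not be the author's symbol.

```latex
% Sketch of why V is continuous (M_0 is a hypothetical label).
\begin{align*}
  W_i(T) &= T(x_i), & \|W_i(T)\| &\le \|T\|\,\|x_i\|, \quad i = 1, 2,\\
  M_0 &= \{T \in M : T(x_1) = 0\} = \{T \in M : T(x_2) = 0\},\\
  W_i'(T + M_0) &= T(x_i), & \|W_i'\| &\le \|W_i\|,\\
  V\,W_1' &= W_2' \;\Longrightarrow\; V = W_2'\,(W_1')^{-1}.
\end{align*}
% W_1' is a bounded one-to-one map of the Banach space M/M_0 onto X, so its
% inverse is bounded by Banach's theorem; V is then a product of bounded maps.
```

The two descriptions of $M_0$ agree because $T(x_1) = 0$ forces $T(x_2) = 0$ by hypothesis, and the converse holds since $V$ is 1-1.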


4.2. COROLLARY. An irreducible algebra in $A(X, X)$ is dense in the
strong and weak topologies.


The closure $\overline{M}$ of $M$ in the uniform topology is an irreducible algebra.
$\overline{M}$ is contained in the closure of $M$ in the strong and weak topologies.
Consequently the closure of $\overline{M}$ in the strong and weak topologies is the same as the
closure of $M$ in these topologies. By Theorem 4.1, $\overline{M}$ is dense, hence $M$ is dense
in the strong and weak topologies.


4.3. COROLLARY. If $B$ is a linear manifold in $A(X, Y)$ which is closed
in the uniform topology when considered as a linear manifold in
$A(X/M(B), Y)$ and if $R_1(B)$ is irreducible, then the closure of $B$ in the
finite topology is the set of all $T \in A(X, Y)$ such that $M(T) \supset M(B)$ and
$N(T) \subset N(B)$.


The assumption on closure implies that $R_1(B)$ is closed. It is also
readily verified that $R_1(B)$ is a linear manifold and therefore an irreducible
algebra. The conclusion then follows from Theorems 4.1 and 3.4.


4.4. COROLLARY. If $B$ is a linear manifold in $A(X, Y)$ with $N(B)$
closed in $Y$, $B$ closed in the uniform topology and $R_1(B)$ irreducible, then
the conclusion of Corollary 4.3 holds for the finite, weak, strong and weakly
finite topologies.


We show first that $N(B)$ is a linear manifold in $Y$. If $y_1, y_2 \in N(B)$ and
$x_0 \notin M(B)$, then there exists a transformation $T \in B$ such that $T(x_0) \neq 0$,
and thus by hypothesis transformations $T_i \in B$ such that $T_i(x_0) = y_i$, $i = 1, 2$. Then
$(T_1 + T_2)(x_0) = y_1 + y_2$ and $N(B)$ is a linear manifold and therefore a Banach space.
By the closure of $B$, $R_1(B)$ is closed in the uniform topology of $A(N(B), N(B))$.
By Theorems 4.1 and 3.1 the closure in the finite topology is the desired set.
Since $N(B)$ is a closed linear manifold, by Theorem 2.1 this is also the
closure in the strong, weak and weakly finite topologies.




5. Ideals and quotient spaces in $A(X, Y)$. By analogy we formulate
the following definition.


5.1. DEFINITION. An additive group $G$ in $A(X, Y)$ is called a left
(right) ideal in $A(X, Y)$ if $G = A(Y, Y)G$ ($G = GA(X, X)$).

5.2. THEOREM. If $G$ is a left ideal in $A(X, Y)$ its closure in the finite,
strong, weak and weakly finite topologies is the set $G^{lr}$. If $G$ is a right
ideal its closure in the finite topology is the set of all $T \in A(X, Y)$ such that
$N(T) \subset N(G)$. Its closure in the strong and weak topologies is the set $G^{rl}$.


This follows readily from the result of section 3. It is then seen that a
left (right) ideal is weakly closed if and only if it is the right (left)
annihilator of some set in $A(Y, X)$. From this we conclude the following results.

5.3. COROLLARY. The lattice of closed linear subspaces of $X$ ($Y$) is
isomorphic to the lattice of left (right) annihilators in $A(X, Y)$.

For the case $X = Y$ this was given for right annihilators by Kakutani
and Mackey [8, p. 53].

5.4. COROLLARY. Let $B$ be any subset of $A(X, Y)$. Then $B^{lr}$ ($B^{rl}$)
is the smallest weakly closed left (right) ideal in $A(X, Y)$ which contains $B$.

It is clear that $B^{lr}$ ($B^{rl}$) is a weakly closed left (right) ideal in $A(X, Y)$
containing $B$. On the other hand, if $E$ is a weakly closed left ideal containing
$B$, then $E^{l} \subset B^{l}$ and $B^{lr} \subset E^{lr} = E$. Thus $B^{lr} \subset E$.

5.5. COROLLARY. For a left ideal $G$ in $A(X, Y)$ the following
statements are equivalent.

(1) $G$ is one-fold transitive.
(2) $G$ is dense in the finite topology.
(3) $G$ is dense in the weakly finite topology.
(4) $G$ is dense in the strong topology.
(5) $G$ is dense in the weak topology.


That (1) implies (2) follows from Theorem 3.1. Only the implication
(5) $\to$ (1) need be shown. Suppose that $G$ is not one-fold transitive. Then
there exists $x_0 \neq 0$ in $X$ and $y_0 \in Y$ such that $T(x_0) = y_0$ is impossible for
$T \in G$. By the left ideal property, $T(x_0) = 0$ for every $T \in G$. Let $T_0 \in A(X, Y)$
have the property that $T_0(x_0) = y \neq 0$ and suppose $y^* \in Y^*$ has the property
$y^*(y) = 1$. Then no $T \in G$ is in the weak neighborhood $N[T_0; x_0; y^*; 1/2]$
which is contradictory to (5).
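
The last step can be made explicit in one line. The sketch assumes that a weak neighborhood $N[T_0; x_0; y^*; \epsilon]$ consists of those $T$ with $|y^*(T(x_0) - T_0(x_0))| < \epsilon$; the notation is fixed in an earlier section that is not reproduced here, so this reading is an assumption.

```latex
% Every T in G annihilates x_0, while T_0(x_0) = y and y^*(y) = 1.
\[
  \bigl|\,y^*\bigl(T(x_0) - T_0(x_0)\bigr)\bigr|
  = |y^*(0 - y)| = |y^*(y)| = 1
  \qquad \text{for every } T \in G .
\]
% Hence no T in G lies in a neighborhood of T_0 of this form with \epsilon \le 1.
```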

We consider next the left ideal $A_l(M)$ and the right ideal $A_r(N)$ of $A(X, Y)$ where
$A_l(M) = \{T \in A(X, Y) \mid T(x) = 0 \text{ for every } x \in M\}$ and $M$ is a closed linear
manifold in $X$, and $A_r(N) = \{T \in A(X, Y) \mid T(x) \in N \text{ for every } x \in X\}$ where $N$ is
a linear manifold in $Y$. In particular we investigate the difference (quotient)
spaces $A_l(M)/A_l(M) \cap A_r(N)$ and $A_r(N)/A_l(M) \cap A_r(N)$.


5.6. THEOREM. Let $M$ and $N$ be closed linear manifolds in $X$ and $Y$
respectively. The quotient space $A_l(M)/A_l(M) \cap A_r(N)$ possesses a linear
continuous 1-1 image which is dense in the finite topology of $A(X/M, Y/N)$.
If there exists a projection of $Y$ on $N$ the image is all of $A(X/M, Y/N)$ and
the correspondence is an isomorphism.


Let $C$ be the quotient space in question. Since $A_l(M)$ and $A_r(N)$ are
closed in $A(X, Y)$ in the uniform topology, $C$, with the customary definition
of norm for quotient spaces, is a Banach space. Let $T_1$ and $T_2$ be two
transformations in $A_l(M)$. They belong to the same coset in $C$ if and only
if the range of $T_1 - T_2$ is in $N$, i.e., if $T_1(x)$ and $T_2(x)$ are always in the
same coset of $Y$ modulo $N$. Now for any $T \in A_l(M)$, $T(x)$ has the same
value over any coset of $X$ modulo $M$. To every coset $L \in C$ there corresponds
a transformation $T'$ from $X/M$ to $Y/N$ defined as follows. Let $x' \in X/M$,
$L \in C$, $T$ be a representative of $L$ and $x$ a representative of $x'$. Then we take
$T'(x')$ as the coset of $Y/N$ which contains $T(x)$. The correspondence $L \to T'$
is readily shown to be additive and homogeneous. Since

\[ \|T'(x')\| = \inf_{y \in T'(x')} \|y\| \leq \inf_{x \in x'} \|T(x)\| \leq \inf_{x \in x'} \|T\|\,\|x\| = \|T\|\,\|x'\|, \]

where $T$ is any representative of $L$ we see that $\|T'\| \leq \|L\|$, that
$T' \in A(X/M, Y/N)$, and that the mapping is continuous. It is 1-1, for
if $T' = 0$ is the image of $L$ then for every $T \in L$, $T(x) \in N$ for every $x$ and
$T \in A_r(N)$, whence $L$ is the zero of $C$.
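
The passage from the pointwise estimate to the bound on the norm of the image is a matter of taking the infimum over representatives. A brief sketch, using only the quotient norms already defined in the text:

```latex
\begin{align*}
  \|T'(x')\| &\le \|T\|\,\|x'\|
     && \text{for every representative } T \in L,\\
  \|T'\| = \sup_{\|x'\| \le 1} \|T'(x')\|
     &\le \inf_{T \in L} \|T\| = \|L\| .
\end{align*}
% The left side does not depend on the choice of T, so the infimum may be taken.
```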

To see that the image is dense we consider $n$ linearly independent elements
$x_1', \dots, x_n'$ of $X/M$ and any $n$ elements $y_1', \dots, y_n'$ of $Y/N$. If $x_i$, $y_i$
are representatives of these cosets and we choose $T \in A_l(M)$ such
that $T(x_i) = y_i$, $i = 1, \dots, n$, then the image $T'$ of the coset containing $T$
has the property that $T'(x_i') = y_i'$, $i = 1, \dots, n$.

Next suppose that there exists a projection of $Y$ on $N$. Let $W \in A(X/M, Y/N)$.
Consider $W_1$ defined from $X$ to $Y/N$ by $W_1(x) = W(x')$ where
$x \in x'$, $x' \in X/M$. The author has shown [11] that the existence of a
projection of $Y$ on $N$ is equivalent to the existence of a continuous linear
transformation $R$ with domain $Y/N$ and range in $Y$ such that the image under $R$
of each coset is an element in that coset. Since $W_1$ vanishes on $M$,
$RW_1 \in A_l(M)$. Let $L$ be the coset of $C$ which contains $RW_1$. Since under
our correspondence we have $L \to W$, the correspondence has all of $A(X/M, Y/N)$
as its range. By a theorem of Banach [2, p. 41], the correspondence
is an isomorphism.


5.7. THEOREM. In the notation of Theorem 5.6, $A_r(N)/A_l(M) \cap A_r(N)$
possesses a linear continuous 1-1 image which is dense in the finite topology
of $A(M, N)$. If there exists a projection of $X$ on $M$, the image is all of
$A(M, N)$ and the correspondence is an isomorphism.


If $L$ is a coset of $A_r(N)$ modulo $A_l(M) \cap A_r(N)$, then $T_1$ and $T_2$ of
$A_r(N)$ are in $L$ if and only if $T_1 - T_2$ vanishes on $M$. Thus to $L$ we can
correspond a transformation $T'$ which is defined on $M$ and has range in $N$.
If $T \in L$, then $T'$ is its contraction to the domain $M$. It is readily seen
that the correspondence is additive and homogeneous. Furthermore since
$\|T'\| \leq \|T\|$ for every representative $T \in L$, $\|T'\| \leq \|L\|$. Thus the
mapping is continuous. As in Theorem 5.6, we see that the image is dense
in the finite topology of $A(M, N)$.

If $T' \in A(M, N)$ and if there is a projection $P$ of $X$ on $M$, then
$T'P \in A_r(N)$ and $T'$ is the image of the coset containing $T'P$ under our
correspondence. Again by Banach's Theorem [2, p. 41] the correspondence
is an isomorphism in this case.


6. On the notion of centralizer. Jacobson [5] has used the notion of
centralizer effectively in his study of the density of rings of transformations.
We can carry this notion over to additive groups in $A(X, Y)$ and obtain partial
success in the direction of Jacobson's results for rings.


6.1. DEFINITION. If $U$ and $V$ are additive transformations defined on
$X$ to $X$ and on $Y$ to $Y$ respectively, then $(U, V)$ is said to be in the centralizer
of $B \subset A(X, Y)$ if $UT = TV$ for every $T \in B$.


6.2. THEOREM. Let $G$ be an additive group in $A(X, Y)$ where $R_1(G)$
and $R_2(G)$ are one-fold transitive. If for every $(U, V)$ in the centralizer
of $G$ at least one of the transformations $U$ and $V$ is a multiple of the identity
transformation, then $G$ is two-fold transitive.


We again use the argument of Jacobson [5, Theorem 6]. $G$ is certainly
one-fold transitive. If $G$ is not two-fold transitive there exist elements $x_1$
and $x_2$, linearly independent in $X$, such that $T(x_1) = 0$ implies $T(x_2) = 0$
for $T \in G$. This defines a mapping $U$: $T(x_1) \to T(x_2)$ of $Y$ into itself. It
must also be impossible to find $S \in R_2(G)$ such that $S(x_1) = 0$ while
$S(x_2) \neq 0$. Thus there exists a mapping $V$: $S(x_1) \to S(x_2)$ of $X$ into itself.
$U$ and $V$ are additive.

Let $T \in G$, $x \in X$. There exists a transformation $S \in R_2(G)$ such that
$S(x_1) = x$. Then $TV(x) = TVS(x_1) = TS(x_2)$. Also $UT(x) = UTS(x_1)$.
But since $S \in R_2(G)$, $TS \in G$. Thus $UTS(x_1) = TS(x_2)$ and $UT = TV$ for
every $T \in G$. If $V$ is a multiple of the identity, $V(x) = \lambda x$, then $x_2 - \lambda x_1 = 0$,
which is impossible, and the same argument holds if $U$ is a multiple of the
identity.

We turn our attention to the case $X = Y$. Here the centralizer of a set
$B$ is the set of all endomorphisms $U$ of $X$ such that $UT = TU$ for every $T \in B$.


6.3. THEOREM. Any non-zero ideal B (left or right) in A(X, X) has 
as its centralizer the set of multiples of the identity transformation. 


Let $B$ be a non-zero left ideal. Arnold [1, p. 30] has shown that $B$
possesses a minimal left ideal consisting of all transformations of the form
$x^*(x)y$ where $x^*$ is a fixed linear functional defined on $X$ and $y$ ranges over $X$.
Let $U$ be in the centralizer of $B$. Then

(1) $x^*[U(x)]y = U[x^*(x)y]$

for every $x \in X$, $y \in X$. Since $x^* \neq 0$ we can choose $x \in X$ such that $x^*(x) = 1$.
If we then take $y = \lambda z$, where $\lambda$ is a complex scalar, we see from (1) that
$\lambda U(z) = U(\lambda z)$ so that $U$ is homogeneous. Also $U(y)$ is a multiple of $y$
for each $y$. An argument used in the proof of Theorem 3.1 shows that $U$
is a multiple of the identity transformation.

Let $B$ be a non-zero right ideal. Arnold [1, p. 31] has shown that $B$
possesses a minimal right ideal consisting of all transformations of the form
$x^*(x)y$ where $y$ is a fixed vector in $X$ and $x^*$ ranges over $X^*$. If $U$ is in the
centralizer of $B$ then (1) holds for every $x \in X$ and $x^* \in X^*$. Taking $x^*(x) = 1$
and $z^* = \lambda x^*$ we see from (1) that $U(\lambda y) = \lambda U(y)$, which yields homogeneity
for multiples of $y$. Hence (1) can be written in the form

(2) $x^*[U(x)]y = x^*(x)U(y)$.

This shows that $U(y)$ is a multiple of $y$, say $U(y) = \alpha y$. Then from (2)
we have

\[ x^*[U(x) - \alpha x]\,y = 0 \]

for every $x \in X$ and $x^* \in X^*$. Therefore $U(x) = \alpha x$ for each $x$ and
thus $U$ is a multiple of the identity transformation.

This theorem together with Jacobson's theorem [5, Theorem 6] shows
that a one-fold transitive left ideal is dense, a fact which is contained in
Corollary 5.5 above.


YALE UNIVERSITY 
AND 
CORNELL UNIVERSITY. 


BIBLIOGRAPHY 


1. B. H. Arnold, “Rings of operators on vector spaces,” Annals of Mathematics, vol. 45 (1944), pp. 24-49.

2. S. Banach, Théorie des opérations linéaires, Warsaw, 1932.

3. S. Bochner and R. S. Phillips, “Absolutely convergent Fourier expansions for non-commutative normed rings,” Annals of Mathematics, vol. 43 (1942), pp. 409-418.

4. I. Gelfand, “Normierte Ringe,” Recueil Mathématique (Mat. Sbornik), New Series, vol. 9 (1941), pp. 3-23.

5. N. Jacobson, “Structure theory of simple rings without finiteness assumptions,” Transactions of the American Mathematical Society, vol. 57 (1945), pp. 228-245.

6. N. Jacobson, “The radical and semi-simplicity for arbitrary rings,” American Journal of Mathematics, vol. 67 (1945), pp. 300-320.

7. N. Jacobson, “On the theory of primitive rings,” Annals of Mathematics, vol. 48 (1947), pp. 8-21.

8. S. Kakutani and G. W. Mackey, “Two characterizations of real Hilbert space,” Annals of Mathematics, vol. 45 (1944), pp. 50-58.

9. G. W. Mackey, “On infinite dimensional linear spaces,” Transactions of the American Mathematical Society, vol. 57 (1945), pp. 155-207.

10. J. von Neumann, “Zur Algebra der Funktionaloperatoren und Theorie der normalen Operatoren,” Mathematische Annalen, vol. 102 (1929), pp. 370-427.

11. B. Yood, “Transformations between Banach spaces in the uniform topology,” Annals of Mathematics, vol. 50 (1949), pp. 486-503.




INVARIANTS OF INTERSECTION OF CERTAIN PAIRS 
OF CURVES IN N-DIMENSIONAL SPACE.* 


By CHUAN-CHIH HSIUNG.


Introduction. It is well known that there is a projective tac-invariant
of Mehmke-Smith [5, 6, 10]¹ associated with two plane curves having
ordinary contact at a nonsingular point. Several authors [3, 7, 8, 9, 11, 12]
have extended this result to two curves $C$, $C'$ in $n$-dimensional space having
the same tangent at an ordinary point $O$ by imposing different conditions on
their respective osculating linear spaces at the point $O$. In particular, B. Segre
[8] considered the most general case in which the two curves $C$, $C'$ have at
the point $O$ the same osculating linear spaces of dimensions $1, \dots, r$, where
$r$ is any fixed integer satisfying $1 \leq r \leq n - 1$. The purpose of this paper
is to study the situation in which the two curves $C$, $C'$ have distinct tangents
at the common point $O$. This investigation may be regarded as a
generalization of some results for ordinary space given by the author in a forthcoming
paper [4].


CHAPTER I. Two Curves Intersecting at an Ordinary Point
With Distinct Osculating Linear Spaces.


1. Derivation of Invariants. Let two curves $C$, $C'$ in $n$-dimensional
projective space $S_n$ ($n \geq 3$) intersect at an ordinary point $O$ with distinct
osculating linear spaces $S_i$, $S'_i$ respectively. Let $x_1, \dots, x_n$
represent projective nonhomogeneous coordinates of a point in
the space $S_n$. If we choose the point $O$ to be the origin and the $x_i$-axis to be
the line of intersection of the osculating linear spaces $S_i$ and $S'_{n-i+1}$, then
the power series expansions of the two curves in the neighborhood of the
point $O$ may be written in the form


(1) $C$: $x_i = a_i x_1^{\,i} + \cdots$ ($i = 2, \dots, n$),


(2) $C'$: $x_i = b_i x_n^{\,n-i+1} + \cdots$ ($i = 1, \dots, n-1$).


* Received September 1, 1948; presented to the American Mathematical Society, 
September 10, 1948. 
1 Numbers in brackets refer to the bibliography at the end of the paper. 




In order to find projective invariants of the two curves $C$, $C'$ at the
point $O$, we consider the most general projective transformation of coordinates
which leaves invariant the point $O$ and all the osculating linear spaces of the
curves $C$, $C'$ at the point $O$. This transformation is expressed in terms of the
nonhomogeneous coordinates by the equations


(3) Xi = (t—1,---,n), 
where 

k=1 


and the $\alpha$'s are arbitrary. The effect of this transformation on equations (1),
(2) is to produce two other systems of equations of the same form whose
coefficients, indicated by stars, are given by the formulas


Eliminating the $\alpha$'s from equations (5), we see that the expressions

are projective invariants associated with the point $O$ of intersection of the
curves $C$, $C'$.
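
A consistency check on the elimination step, offered as a sketch and not as the paper's own formulas. It assumes (i) that the leading terms of (1), (2) are $x_i = a_i x_1^{\,i}$ and $x_i = b_i x_n^{\,n-i+1}$, and (ii) that on leading terms the transformation acts as a rescaling $x_i = \alpha_i X_i$ of each axis. The symbol $\Phi_i$ is a hypothetical name for the resulting combination; it is not claimed to match the author's normalization of $I_i$.

```latex
% Assumed leading-order transformation laws under x_i = \alpha_i X_i:
\begin{align*}
  a_i^* &= a_i\,\alpha_1^{\,i}\,\alpha_i^{-1} \quad (i = 2, \dots, n), &
  b_i^* &= b_i\,\alpha_n^{\,n-i+1}\,\alpha_i^{-1} \quad (i = 1, \dots, n-1).
\end{align*}
% Candidate combination (hypothetical name \Phi_i):
\[
  \Phi_i = \Bigl(\frac{a_i}{b_i}\Bigr)^{n-1}
           \frac{b_1^{\,n-i}}{a_n^{\,i-1}}
  \qquad (i = 2, \dots, n-1).
\]
% Exponent of \alpha_1 in \Phi_i^*/\Phi_i:   i(n-1) - (n-i) - n(i-1) = 0.
% Exponent of \alpha_n in \Phi_i^*/\Phi_i:  -(n-i+1)(n-1) + n(n-i) + (i-1) = 0.
% The \alpha_i factors already cancel in a_i/b_i, so \Phi_i^* = \Phi_i.
```

Under these assumptions $a_n$ and $b_1$ are the only coefficients available to absorb $\alpha_1$ and $\alpha_n$, which is why they enter with the exponents $i-1$ and $n-i$.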


2. Metric and projective characterizations of a general invariant $I_i$.
For the purpose of finding a simple metric characterization of a general
invariant $I_i$, we make a projective transformation which leaves the point $O$
unchanged and carries the $x_1$-, $\dots$, $x_n$-axes into mutually perpendicular
new axes. Let $\Gamma$, $\Gamma'$ be the transformed curves of $C$, $C'$, and $x_1, \dots, x_n$ be
the nonhomogeneous Cartesian coordinates of a point in the space $S_n$ referred
to the new orthogonal coordinate system; then the power series expansions of
the curves $\Gamma$, $\Gamma'$ in the neighborhood of the point $O$ may be written in the form


(7) (i 2,- -,#), 


— 

~ 

=> 


(8) I’: == +-- - (a 
and the invariant $I_i$ takes the form


(9) 1, = "9 (1 == 2,° *,n—1). 




Let $\rho_i$, $\sigma_i$ ($i = 1, \dots, n-1$) be respectively the $i$-th curvatures of the
curves $\Gamma$, $\Gamma'$ at the point $O$, then from the generalized Frenet-Serret formulas ²


(10) di pi-s/t!, bj *Gy-i/(n—i+1)! 


(t=-1,---,n—1), 
and therefore 


(t= 2,---,n—1), 


where $\sim$ denotes that both its members are equal except for a constant factor.
Thus we arrive at the following conclusion.


Let $\rho_i$, $\sigma_i$ ($i = 1, \dots, n-1$) be respectively the $i$-th curvatures of the
two curves $C$, $C'$ at the point $O$, then the quantities


are projective invariants, and 


Let the “point at infinity” on the $x_i$-axis ($i = 1, \dots, n$) be denoted
by $O_i$, and $\Gamma_i$, $\Gamma'_i$ be the projections of the curves $C$, $C'$ from the center
$O_2 \cdots O_{i-1} O_{i+1} \cdots O_{n-1}$ onto the space $O O_1 O_i O_n$ ($i = 2, \dots, n-1$). In order
to interpret a general invariant $I_i$ projectively, we consider a system of cones
of order $n - 1$ and vertex $O$ in the space $O O_1 O_i O_n$ such that the polar surfaces
of orders $1, \dots, n-i-1$, $n-i+1, \dots, n-1$ of any point in the plane
$O O_i O_n$ with respect to any cone of the system (the last of these polars being
the cone itself) all pass through the line $O O_1$. The equations of this system
of cones can easily be written as


(15) S + Dd 0, 
with $j, k = 0, 1, \dots, i-1$; $j + k = i - 1$, and
$h, l = 0, 1, \dots, n-1$; $h + l = n - 1$,
where the coefficients are arbitrary. If the cones of this system have contact
of the $i(n-1)$-th order with the curve $\Gamma_i$ at the point $O$, then their equation
(15) reduces to

2 Cf. [1, p. 16] or [7, p. 396]. 




(16) + A jp ay 0, 


with j +, 


where the $\lambda$'s are arbitrary. Among the cones (16) we can determine a
unique one which passes through the tangent of the curve $\Gamma'_i$ at the point $O$
and with respect to which the polar surfaces of orders $1, \dots, n-2$ of any
point in the plane $O O_1 O_n$ all pass through the line $O O_i$. It is easily seen that
for this cone all $\lambda$'s vanish and equation (16) then becomes


Similarly, we may have a cone of order $n - 1$ and vertex $O$ with the
equation


which has contact of the $(n-1)(n-i+1)$-th order with the curve $\Gamma'_i$
at the point $O$ and has the same other properties as that of the cone (17).
The two cones (17), (18) determine a pencil of cones of order $n - 1$ and
vertex $O$ in the space $O O_1 O_i O_n$. Belonging to this pencil there are two degenerate
cones $K_1$ consisting of the plane $O O_1 O_n$, counted $n - 1$ times, and $K_2$
consisting of the plane $O O_i O_n$, counted $n - i$ times, and the plane $O O_1 O_i$, counted
$i - 1$ times. Hence the invariant $I_i$ is equal to the cross ratio of the four
cones $K_1$, $K_2$, (17), (18).
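
The pencil can be made concrete under assumed leading terms. The sketch below supposes that the projected curves are $x_i = a_i x_1^{\,i}$, $x_n = a_n x_1^{\,n}$ and $x_i = b_i x_n^{\,n-i+1}$, $x_1 = b_1 x_n^{\,n}$ to leading order. The explicit cone equation and the parameter $\mu$ are a reconstruction for illustration and are not claimed to reproduce (17), (18) exactly.

```latex
% Hypothetical pencil of cones of order n-1 with vertex O:
\[
  x_i^{\,n-1} - \mu\, x_1^{\,n-i}\, x_n^{\,i-1} = 0 .
\]
% Along the first curve both terms have order i(n-1) in x_1 and the leading
%   terms cancel for \mu_1 = a_i^{n-1} / a_n^{i-1}.
% Along the second curve both terms have order (n-1)(n-i+1) in x_n and the
%   leading terms cancel for \mu_2 = b_i^{n-1} / b_1^{n-i}.
% \mu = 0:      the plane x_i = 0 counted n-1 times.
% \mu = \infty: the plane x_1 = 0 counted n-i times and x_n = 0 counted i-1 times.
\[
  \frac{\mu_1}{\mu_2}
  = \Bigl(\frac{a_i}{b_i}\Bigr)^{n-1} \frac{b_1^{\,n-i}}{a_n^{\,i-1}} .
\]
```

The contact orders and the multiplicities of the degenerate members agree with those stated in the text, and the ratio $\mu_1/\mu_2$ is the cross ratio of the four members up to the ordering convention.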


CHAPTER II. Two Curves Intersecting at an Ordinary Point 
With Distinct Tangents But Certain Common 
Osculating Linear Spaces. 


3. Derivation of invariants. Let two curves $C$, $C'$ in $n$-dimensional
projective space $S_n \equiv S'_n$ intersect at an ordinary point $O$ with tangents $t$, $t'$
and osculating linear spaces $S_k$, $S'_k$ of dimensions $k$ ($k = 2, \dots, n-1$)
respectively. In this chapter we shall consider the case in which the tangents
$t$, $t'$ are distinct but $S_2 \equiv S'_2, \dots, S_r \equiv S'_r$, where $r$ is any fixed integer
satisfying $2 \leq r \leq n - 1$, and there is no other relation among the osculating
linear spaces of the two curves $C$, $C'$ at the point $O$.

We first introduce in the space $S_n$ any system of nonhomogeneous
projective coordinates $x_1, \dots, x_n$ with origin at the point $O$, having the $x_1$-,
$x_2$-axes as the tangents $t$, $t'$ respectively, the $x_3$-, $\dots$, $x_r$-axes as independent
lines lying in the common osculating linear spaces of dimensions $3, \dots, r$
respectively, and the $x_{r+1}$-, $\dots$, $x_n$-axes as independent lines not lying in the
common osculating linear space $S_r$ but lying in the $(r+1)$-dimensional
spaces of intersection of $S_{r+1}$ and $S'_n$, $\dots$, and $S_n$ and $S'_{r+1}$,
respectively. Then the power series expansions of the two curves in the
neighborhood of the point $O$ may be written in the form


(19) C: +--- == 2,---,0n), 
r = bor,” vi 1 = 3,° 
(20) 


Let us now make a most general projective transformation of coordinates
which leaves the point $O$ and the tangents $t$, $t'$ invariant, and changes the
other axes in such a way that the new axes have the same properties as the
old. This transformation is expressed in terms of the nonhomogeneous
coordinates by the equations


n 
k=3 


k=t 


where $D$ is defined by expression (4) and the $\alpha$'s are arbitrary. The effect
of this transformation on equations (19), (20) is to produce two other systems
of equations of the same form whose coefficients, indicated by stars, are given
by the formulas


(23) 


Eliminating the $\alpha$'s from equations (22), (23), we see that the expressions

(24) $I_i = (a_i/b_i)^3 (b_2/a_2)^i$ ($i = 3, \dots, r$),

(25) $J_i = (a_i/b_i)^3\, b_2^{\,2(n+r+1)-3i} / a_2^{\,3i-n-r-1}$ ($i = r+1, \dots, n$)

are projective invariants associated with the point $O$ of intersection of the
curves $C$, $C'$.
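
The exponents in (24) and (25) can be checked by a scaling argument. The sketch assumes that the leading terms are $x_i = a_i x_1^{\,i}$ on $C$ and, on $C'$, $x_1 = b_2 x_2^{\,2}$, $x_i = b_i x_2^{\,i}$ for $3 \leq i \leq r$ and $x_i = b_i x_2^{\,n+r+1-i}$ for $r < i \leq n$, and that the transformation acts on leading terms as $x_i = \alpha_i X_i$. Both the assumed expansions and the abbreviation $s$ are introduced here for illustration.

```latex
% Abbreviation (hypothetical): s = n + r + 1.
\begin{align*}
  \Bigl(\frac{a_i}{b_i}\Bigr)^{\!*} &= \frac{a_i}{b_i}
      \Bigl(\frac{\alpha_1}{\alpha_2}\Bigr)^{i} \quad (3 \le i \le r), &
  \Bigl(\frac{b_2}{a_2}\Bigr)^{\!*} &= \frac{b_2}{a_2}
      \Bigl(\frac{\alpha_2}{\alpha_1}\Bigr)^{3},
\end{align*}
% so (a_i/b_i)^3 (b_2/a_2)^i picks up (\alpha_1/\alpha_2)^{3i - 3i} = 1.
\[
  \Bigl(\frac{a_i}{b_i}\Bigr)^{\!*} = \frac{a_i}{b_i}\,
      \alpha_1^{\,i}\,\alpha_2^{-(s-i)} \qquad (r < i \le n),
\]
% For J_i = (a_i/b_i)^3 b_2^{2s-3i} a_2^{-(3i-s)}, with
%   a_2^* = a_2 \alpha_1^2/\alpha_2 and b_2^* = b_2 \alpha_2^2/\alpha_1:
% exponent of \alpha_1:  3i - (2s - 3i) - 2(3i - s) = 0,
% exponent of \alpha_2: -3(s - i) + 2(2s - 3i) + (3i - s) = 0.
```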




4. Metric and projective characterizations of general invariants $I_i$
and $J_i$. Let $\rho_i$, $\sigma_i$ ($i = 1, \dots, n-1$) be respectively the $i$-th curvatures
of the two curves $C$, $C'$ at the point $O$, then it is easy to show that the
quantities

Hs = p2/02, (t= 4,---,r), 
(26) = (pr/or* * * On-1)* 

Ki = 


are projective invariants, and 


Let the “point at infinity” on the $x_i$-axis ($i = 1, \dots, n$) be denoted
by $O_i$, and $\Gamma_i$, $\Gamma'_i$ be the projections of the curves $C$, $C'$ from the center
$O_3 \cdots O_{i-1} O_{i+1} \cdots O_n$ onto the space $O O_1 O_2 O_i$ ($i = 3, \dots, r$). It is
immediately seen that the equations of the curves $\Gamma_i$, $\Gamma'_i$ are


(29) Ti: + ++, 
(30) li: — dor? 


Now we project $\Gamma_i$, $\Gamma'_i$ onto the plane $O O_1 O_i$ from a point $V$ arbitrarily chosen
in the plane $O O_1 O_2$. If the coordinates of the point $V$ satisfy equations (28)


and 7; = 74, % — 7, 7; =0, then the two projections are easily found to be 
given by the equations = Ni. = Vi. =X, —0 and 
(31) X;=a;,X,'+- 

(32) Xi = bi (— 


The cones projecting the curves $\Gamma_i$, $\Gamma'_i$ from the point $V$ have contact of order
at least $i$ along the line $OV$ if, and only if, the invariant of contact of the
curves (31), (32) at the point $O$ equals unity; then, and only then, the center
$V$ of projection lies on the principal lines [2, p. 25] at the point $O$ of the
two curves $\Gamma_i$, $\Gamma'_i$. Hence these principal lines have equations (28), $x_i = 0$
and

(33) + = 0. 


On the other hand, let $Q$ ($Q'$) be any three-point quadric of the curve $\Gamma_i$
($\Gamma'_i$) at the point $O$ in the three-dimensional space $O O_1 O_2 O_i$ satisfying the
condition that the tangent plane of $Q$ ($Q'$) at the other point $T'$ ($T$) of
intersection of $Q$ ($Q'$) with the tangent $t'$ ($t$) passes through the point $T$ ($T'$).
Then by means of expansions (29), (30) it is easy to show that the lines
joining the point $O$ to the other three points of intersection of the two conics
determined by the quadrics $Q$, $Q'$ and the plane $(t, t')$ have equations (28),
$x_i = 0$ and
(34) — = 0. 


If $u$ be any one of the principal lines (33) and $v$ be any one of the three
lines (34), then the projective invariant $I_i$ ($i = 3, \dots, r$) associated with
the point $O$ of intersection of the curves $C$, $C'$ is, except for sign, equal to the
$3i$-th power of the cross ratio $(tt', uv)$.


In order to find a projectively geometric characterization of a general
invariant $J_i$ of expressions (25), we consider the projections $\Gamma_i$, $\Gamma'_i$ of the
curves $C$, $C'$ from the center $O_3 \cdots O_{i-1} O_{i+1} \cdots O_n$ onto the space $O O_1 O_2 O_i$
($i = r+1, \dots, n$). These projections are readily found to be given by
equations (28), (29) and


As in § 2, we consider in the space $O O_1 O_2 O_i$ a system of cones of order $i - 1$ and
vertex $O$, having contact of order $2(i-1)$ with the curve $\Gamma_i$ at the point $O$
and satisfying the conditions that the polar surfaces of orders $1, \dots, i-3$,
$i-1$ of any point in the plane $O O_2 O_i$ with respect to any cone of the system
all pass through the line $O O_1$. The equations of this system are easily found
to be (28) and


(36) A; — + = 0, 
with 1=0,1,- --,t1—2; m=—1,:--,1—1; 


where the $E$'s are arbitrary. Similarly, we have in the space $O O_1 O_2 O_i$ another
system of cones of order $n + r - i$ and vertex $O$, having contact of order
$2(n + r - i)$ with the curve $\Gamma'_i$ at the point $O$ and satisfying the conditions
that the polar surfaces of orders $1, \dots, n+r-i-2$, $n+r-i$ of any
point in the plane $O O_1 O_i$ with respect to any cone of the system all pass
through the line $O O_2$. The equations of this system are (28) and


with m—1,:--,n+r—1; 
l+m=n+r—i, 


where the $F$'s are arbitrary. If we further impose on the cones (36), (37)
the conditions that the planes through the line $O O_1$, in which their
intersection lies, reduce to a single plane counted $n + r - 2$ times, then these
planes coincide with one or other of the $n + r - 2$ planes satisfying equations
(28) and


On the other hand, if we impose on the cones (36), (37) the conditions that
the planes through the line $O O_2$, in which their intersection lies, reduce to a
single plane counted $n + r - 2$ times, then these planes coincide with one
or other of the $n + r - 2$ planes satisfying equations (28) and


(39) ntr-2 a(t?) (n+r-i-1)h, (i-1) (n+r-t) n+r-2 
The line $O O_i$ and the lines of intersection of the two pencils (38), (39) of
planes determine in the space $O O_1 O_2 O_i$ another pencil of planes, which intersect
the plane $O O_1 O_2$ in $n + r - 2$ lines whose equations are (28), $x_i = 0$ and


Let $v$ be any one of the lines (34) and $w$ be any one of the lines (40) in the
plane $O O_1 O_2$, then the cross ratio of the four lines $t$, $t'$, $v$, $w$ is equal to
(ntr-2) 1, 


MICHIGAN STATE COLLEGE, 
East LANSING, MICH. 


BIBLIOGRAPHY 


1. C. Guichard, “Les courbes de l'espace à n dimensions,” Mémorial des Sciences Mathématiques, Paris, vol. 29 (1928).

2. G. H. Halphen, “Sur les invariants différentiels des courbes gauches,” Journal de l'École Polytechnique, vol. 28 (1880).

3. C. C. Hsiung, “Projective invariants of contact of two curves in space of n dimensions,” Quarterly Journal of Mathematics, Oxford Series, vol. 17 (1946), pp. 39-45.

4. C. C. Hsiung, “Invariants of intersection of certain pairs of space curves,” to appear in the Bulletin of the American Mathematical Society.

5. R. Mehmke, “Einige Sätze über die räumliche Collineation und Affinität welche sich auf die Krümmung von Curven und Flächen beziehen,” Schlömilchs Zeitschrift für Mathematik und Physik, vol. 36 (1891), pp. 56-60.

6. R. Mehmke, “Über zwei die Krümmung von Curven und das Gauss'sche Krümmungsmass von Flächen betreffende charakteristische Eigenschaften der linearen Punkttransformationen,” ibid., vol. 36 (1891), pp. 206-213.

7. B. Segre, “Sugli elementi curvilinei che hanno comuni le origini ed i relativi spazi osculatori,” Rendiconti dei Lincei, (6), vol. 22 (1935), pp. 392-399.

8. B. Segre, “On tac-invariants of two curves in a projective space,” Quarterly Journal of Mathematics, Oxford Series, vol. 17 (1946), pp. 35-38.

9. C. Segre, “Sugli elementi curvilinei, che hanno comuni la tangente e il piano osculatore,” Rendiconti dei Lincei, (5), vol. 33 (1924), pp. 325-329.

10. H. J. S. Smith, “On the focal properties of homographic figures,” Proceedings of the London Mathematical Society, (1), vol. 2 (1869), pp. 196-248.

11. B. Su, “Note on a theorem of B. Segre,” Science Record, Academia Sinica, vol. 1 (1942), pp. 16-19.

12. B. Su, “On certain tac-invariants of two curves in a projective space,” Quarterly Journal of Mathematics, Oxford Series, vol. 17 (1946), pp. 116-118.




A CHARACTERIZATION OF NON-ISENTROPIC IRROTATIONAL
FLOWS.* †


By DAVID GILBARG.


1. Introduction. The standard theory of irrotational gas flows
postulates that the flows be isentropic, that is, that the entropy be constant
throughout the region of flow. In this work, the isentropic condition is replaced by
the weaker adiabatic hypothesis that entropy is constant along the individual
streamlines. An apparently larger class of irrotational flows is thus obtained.
Our main result is the characterization in the large, as well as in the small,
of a wide class of these irrotational flows. It is shown that all steady plane
subsonic irrotational flows are either isentropic or are vortex flows (Theorem
2), and that if certain general conditions are placed on the flow boundaries,
the same result is true for supersonic and mixed flows (Theorems 4 and 6).
Similar results are established for axially symmetric flows (Theorems 3, 5).
These results signify for the general theory that the class of adiabatic
irrotational flows is not essentially wider than the isentropic flows, and that the
isentropic assumption is generally superfluous. Hicks [1], Truesdell and
Prim [2], and others, also have studied the relation between irrotationality
and entropy variation, but from a different viewpoint than is taken here.

The proofs make essential use of uniqueness theorems for the Cauchy 
initial value problem, applied in this case to the differential equation for the 
velocity potential of isentropic irrotational flow. The discussion shows that 
more general uniqueness theorems than are now available would strengthen 
some of the results obtained here. 


2. Non-isentropic irrotational flows in the small. Let the flow region
be an arbitrary open connected set $R$ in 3-space. A steady adiabatic
irrotational flow of an ideal compressible fluid will be considered defined by
functions $u$, $v$, $w$, $p$, $\rho$, $\eta$ which are continuously differentiable in $R$ with respect
to space coordinates $x$, $y$, $z$, and satisfy the following equations:

(1) $(V \cdot \nabla)V = -(\mathrm{grad}\, p)/\rho$, $V = (u, v, w)$
(2) $\mathrm{div}\,(\rho V) = 0$


* Received February 16, 1949. 
+ Prepared under Navy Contract No. N6onr-180, Task Order 5, with Indiana 
University. 


(3) $\mathrm{curl}\, V = 0$
(4) $p = f(\eta)\rho^{\gamma}$, $f' \neq 0$, $\gamma = \text{constant} \neq 0$
(5) $d\eta/dt = V \cdot \mathrm{grad}\, \eta = 0$.


The term ‘irrotational flow’ will be used here to mean flows of this type.
Equations (1) and (2) are the usual equations of motion and continuity,
$V = (u, v, w)$ being the velocity vector of the flow, $\rho$ the density (assumed
$\neq 0$), $p$ the pressure; (3) is the condition of irrotationality; (4) is the
equation of state for an ideal polytropic gas;¹ (5) is the statement that
entropy $\eta$ is constant for a particle, or in other words, that the medium is
thermally non-conducting; this replaces the stronger isentropic condition,
$\mathrm{grad}\, \eta = 0$, which is customarily assumed in the theory of irrotational gas
flows.

We now consider the flow as divided into two sets: the isentropic set A
is the set of points at which $\mathrm{grad}\, \eta = 0$; the non-isentropic set B is the set
of points at which $\mathrm{grad}\, \eta \neq 0$. It is clear, by continuity, that A and B are
respectively closed and open relative to R. It will be assumed that none of 
the components (i.e. maximal connected sets) of B consists entirely of 
stagnation points. 

The vortex and helicoidal flows play a central part in what follows; 
we therefore fix our ideas with these definitions. An irrotational flow in a 
plane region G is a vortex flow if either (1) the velocity vector is constant
in G, or (2) the streamlines are concentric circular arcs, and the flow speed
$q$ at any point in G a distance $r$ from the common center is $q = k/r$, where $k$
is constant for the flow. It follows that for vortex flows $p$ and $\rho$ are constant
on streamlines, and either $\rho$ or $\eta$ may be arbitrary functions of the streamline.
A helicoidal flow is a spatial irrotational flow obtained by normal superposition
of a uniform flow on a vortex flow; that is, for suitable coordinates
$x, y, z$, the velocity vector is given by $\mathrm{grad}\,(wz + k \arctan(y/x))$, where $w$, $k$
are constants for the flow. The streamlines are thus concentric helices of
the same pitch. One sees that for helicoidal flows $p$ and $\rho$ are again constant
on streamlines and that, if the flow is non-uniform, either $p$, $\rho$, or $\eta$ may be
an arbitrary function of $r$ $(= (x^2 + y^2)^{1/2})$, whereas if the flow is uniform,
either $\rho$ or $\eta$ may be an arbitrary function of the streamline (but $p$ is constant).
The vortex flows are evidently special cases of the helicoidal flows,
corresponding to flows for which either $w$ or $k = 0$.
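The kinematic claims in the definition of a helicoidal flow can be checked numerically. The following is only a minimal sketch: the constants `w = 0.5` and `k = 2.0`, the sample points, and all function names are illustrative assumptions, not part of the paper. It estimates div V and curl V by central differences for $V = \mathrm{grad}\,(wz + k \arctan(y/x))$ and compares the speed at two points lying on the same cylinder $r = 1$.

```python
import math

def helicoidal_velocity(x, y, z, w=0.5, k=2.0):
    """Velocity grad(w*z + k*arctan(y/x)); w and k are illustrative values."""
    r2 = x * x + y * y
    return (-k * y / r2, k * x / r2, w)

def partial(component, axis, point, h=1e-5):
    """Central difference of one velocity component along one coordinate axis."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (helicoidal_velocity(*plus)[component]
            - helicoidal_velocity(*minus)[component]) / (2.0 * h)

def divergence(point):
    """Finite-difference estimate of div V (should be close to zero)."""
    return sum(partial(i, i, point) for i in range(3))

def curl(point):
    """Finite-difference estimate of curl V (each component close to zero)."""
    return (partial(2, 1, point) - partial(1, 2, point),
            partial(0, 2, point) - partial(2, 0, point),
            partial(1, 0, point) - partial(0, 1, point))

def speed(point):
    """Flow speed q = |V| at the given point."""
    return math.sqrt(sum(c * c for c in helicoidal_velocity(*point)))

if __name__ == "__main__":
    sample = (1.2, 0.7, 0.3)
    print("div V  =", divergence(sample))
    print("curl V =", curl(sample))
    # Two points on the same cylinder r = 1 carry the same speed.
    print(speed((1.0, 0.0, 0.0)), speed((0.0, 1.0, 5.0)))
```

Since $q^2 = k^2/r^2 + w^2$ depends on $r$ alone, the speed is constant on each helical streamline, which is the property used in the proof of Theorem 1.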

The vortex and helicoidal flows are irrotational and may be non-isentropic.


¹ The same results hold for more general equations of state; see concluding remarks.



NON-ISENTROPIC IRROTATIONAL FLOWS. 689 


It will be seen that, in the plane and space, respectively, they are essentially 
the only non-isentropic flows. This is first demonstrated in the small in the 
following theorem. 


THEOREM 1. The flow in any region of the non-isentropic set of an
irrotational flow is helicoidal.


Proof. From equations (1), (3), and (4), we have,
$\mathrm{grad}\, \tfrac{1}{2}(u^2 + v^2 + w^2) = -(\mathrm{grad}\, p)/\rho = -\rho^{\gamma-1}[f' \,\mathrm{grad}\, \eta + \gamma f (\mathrm{grad}\, \rho)/\rho]$.

The first equality implies that, in the neighborhood of any point at which
$\mathrm{grad}\, p \neq 0$, $u^2 + v^2 + w^2 = q^2 = [q(p)]^2$ and $\rho = \rho(p)$. Hence, using both
equalities, we obtain at every point of the flow,

$0 = \mathrm{grad}\, p \times \mathrm{grad}\, \eta = \mathrm{grad}\, \rho \times \mathrm{grad}\, \eta = \mathrm{grad}\, q^2 \times \mathrm{grad}\, \eta$.

It follows that in the neighborhood of every point of B, $p = p(\eta)$, $\rho = \rho(\eta)$,
and $q = q(\eta)$, and, as a consequence, $p$, $\rho$, and $q$ are constant on the streamline
components of B. The equation of continuity (2) thus takes the simpler
form in B, $\mathrm{div}\, V = 0$. In conjunction with (3), this establishes that the
flow in any subregion $B^* \subset B$ is described by a velocity potential $\phi(x, y, z)$
which is a harmonic function, which, furthermore, satisfies the additional
condition that $q = |\mathrm{grad}\, \phi|$ is constant on streamlines. Hamel [3] has
shown that the velocity field of such a potential must be helicoidal.
As a particular case of the preceding theorem, we have, 


COROLLARY 1. The flow in any region of the non-isentropic set of a
plane irrotational flow is a vortex flow.


An independent proof of this result is obtained easily by function 
theoretic methods. 
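A concrete non-isentropic vortex flow of the kind described in Corollary 1 can be sketched numerically. The sketch below rests on several assumptions that are not in the paper: the density profile $\rho(r) = 1 + r$, the choice $f(\eta) = e^{\eta}$ in (4), the values of `GAMMA`, `K`, `R0`, `P0`, and all function names. For a circular flow with $q = k/r$, equation (1) reduces to $dp/dr = \rho q^2/r$; the code integrates this in closed form for the assumed density, recovers $\eta$ from (4), and evaluates the residual of the radial momentum balance.

```python
import math

GAMMA = 1.4         # assumed adiabatic exponent
K = 1.0             # assumed vortex constant, q = K / r
R0, P0 = 1.0, 5.0   # assumed reference radius and pressure (use r >= R0)

def density(r):
    """Arbitrarily prescribed density profile (an assumption)."""
    return 1.0 + r

def pressure(r):
    """p(r) = P0 + integral from R0 to r of density(s) * K**2 / s**3 ds."""
    def antiderivative(s):
        # Antiderivative of (1 + s) * K**2 / s**3.
        return -K ** 2 * (1.0 / (2.0 * s * s) + 1.0 / s)
    return P0 + antiderivative(r) - antiderivative(R0)

def entropy(r):
    """eta from p = exp(eta) * rho**GAMMA, i.e. the assumed choice f = exp."""
    return math.log(pressure(r) / density(r) ** GAMMA)

def momentum_residual(r, h=1e-5):
    """dp/dr - rho * q**2 / r, which should vanish by equation (1)."""
    dp_dr = (pressure(r + h) - pressure(r - h)) / (2.0 * h)
    q = K / r
    return dp_dr - density(r) * q * q / r

if __name__ == "__main__":
    for r in (1.0, 1.5, 2.0, 3.0):
        print(r, pressure(r), entropy(r), momentum_residual(r))
```

Because $\rho$, $p$ and $\eta$ depend on $r$ only and the velocity is tangential, equations (2) and (5) hold automatically, while $\eta$ varies from streamline to streamline, so this flow is irrotational but not isentropic.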


COROLLARY 2. The flow in any region of the non-isentropic set of an
axially symmetric irrotational flow is a uniform flow.


This follows from the obvious fact that the only axially symmetric 
helicoidal flows are uniform. 

The preceding result is essentially local in that it applies only to regions 
of the non-isentropic set. For a characterization of the non-isentropic irro- 
tational flows in the large we must determine the influence of the various 
parts of the flow on one another. This is provided by the following lemmas. 


LEMMA 1. Any streamline of an irrotational flow lies entirely either in
the isentropic set or in the non-isentropic set of the flow.




Proof. Let S be a streamline containing a point P in B. Suppose, in
contradiction to the lemma, that S meets A; then, since A is closed, there
is a segment $S'$ such that $P \in S' \subset S$, $S' \subset B$, and $S'$ has an endpoint Q in A.
Since the flow in the component of B containing $S'$ is helicoidal, we have
that on $S'$, $|\mathrm{grad}\, \eta| = k \neq 0$, for some constant $k$. By continuity,
$|\mathrm{grad}\, \eta| = k$ also at Q, contradicting the assumption that $S'$ could have an
endpoint in A. Therefore, S must lie entirely in B, and indeed in one
component of B. This proves the lemma.

The preceding lemma shows that the sets A and B are invariant under 
the flow in the sense that particles in either set remain in that set as long 
as they are in the region of flow. 

LEMMA 2. Let $\dot{B}$ be the set of boundary points of the non-isentropic
set; then if a point P of $\dot{B}$ is on a streamline C, C lies entirely in $\dot{B}$, and is a
circular helix² which extends to the boundary of the flow region (perhaps
at infinity): $q$, $p$, and $\rho$ are constant on C.


Proof. Let $C = (x(t), y(t), z(t))$ be the streamline³ through $P = (x(t_0), y(t_0), z(t_0))$.
In B there is a sequence of points $P_n$ converging to P; let
$C_n = (x_n(t), y_n(t), z_n(t))$ be the (helical) streamlines through $P_n$, the
parametrization of $C_n$ being chosen such that $P_n = (x_n(t_0), y_n(t_0), z_n(t_0))$. By
virtue of the continuous dependence of the sequence $x_n(t), y_n(t), z_n(t)$ on the
initial values at $t_0$, the sequence converges uniformly to $x(t), y(t), z(t)$ in
every closed interval $(t_0, t_1)$, and hence the limit curve C of the sequence of
helices $C_n$ is itself a helix and lies in $\dot{B}$. The functions $q$, $p$, $\rho$ on C, being
the limit of the corresponding quantities $q_n$, $p_n$, $\rho_n$ on $C_n$, must evidently be
constant on C. C cannot contain a stagnation point, and therefore does not
terminate in R. This completes the proof.

We shall now limit the discussion to plane and axially symmetric flows. 


3. Plane and axially symmetric flows in the large. For the proof of
the basic Lemma 3, we require certain uniqueness theorems from the theory 
of partial differential equations. We consider in particular the quasilinear 
equations for the velocity potential of an isentropic flow. 

² 'Circular helix' is here used also for the degenerate cases of a circle and straight
line. This convention of including the limiting forms of a curve under one heading will
be adopted throughout, except where explicit reference is made to the limiting forms.

³ The non-constant solutions $x(t), y(t), z(t)$ of the system $dx/dt = u(x,y,z)$,
$dy/dt = v(x,y,z)$, $dz/dt = w(x,y,z)$, are understood to define the streamlines of the
flow.


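Plane flows:

(6a) $(1 - \phi_x^2/c^2)\phi_{xx} - 2(\phi_x\phi_y/c^2)\phi_{xy} + (1 - \phi_y^2/c^2)\phi_{yy} = 0$;

Axially symmetric flows:

(6b) $(1 - \phi_x^2/c^2)\phi_{xx} - 2(\phi_x\phi_y/c^2)\phi_{xy} + (1 - \phi_y^2/c^2)\phi_{yy} + \phi_y/y = 0$;

$c^2 = c_0^2 + \tfrac{1}{2}(1 - \gamma)(\phi_x^2 + \phi_y^2)$, $c_0$ = constant.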
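The relation between $c^2$ and the speed fixes the type of equations (6a,b) at each point. The sketch below is illustrative only: the default values `c0_sq = 1.0` and `gamma = 1.4`, the tolerance, and the function names are assumptions. It evaluates $c^2 = c_0^2 + \tfrac{1}{2}(1-\gamma)q^2$ with $q^2 = \phi_x^2 + \phi_y^2$, and classifies the equation as elliptic, parabolic, or hyperbolic according as $q^2 < c^2$, $q^2 = c^2$, or $q^2 > c^2$; the sonic value is $q^2 = 2c_0^2/(1+\gamma)$.

```python
def sound_speed_sq(q_sq, c0_sq=1.0, gamma=1.4):
    """c^2 = c0^2 + (1 - gamma)/2 * q^2, as in the relation following (6b)."""
    return c0_sq + 0.5 * (1.0 - gamma) * q_sq

def critical_speed_sq(c0_sq=1.0, gamma=1.4):
    """Value of q^2 at which q^2 = c^2, namely 2*c0^2/(1 + gamma)."""
    return 2.0 * c0_sq / (1.0 + gamma)

def equation_type(q_sq, c0_sq=1.0, gamma=1.4, tol=1e-12):
    """Type of (6a,b) at a point with squared speed q_sq."""
    diff = q_sq - sound_speed_sq(q_sq, c0_sq, gamma)
    if abs(diff) <= tol:
        return "parabolic (sonic)"
    return "hyperbolic (supersonic)" if diff > 0 else "elliptic (subsonic)"

if __name__ == "__main__":
    q_star_sq = critical_speed_sq()
    for factor in (0.5, 1.0, 1.5):
        print(factor, equation_type(factor * q_star_sq))
```

The three cases correspond to the three kinds of Cauchy data treated in the uniqueness theorems of this section.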


On a given smooth curve $C = (x(s), y(s))$, where $s$ is the arc length ($y(s) > 0$
for (6b)), let Cauchy data $\phi = \phi(s)$, $\phi_x = u(s)$, $\phi_y = v(s)$ be given
functions of $s$ which satisfy the compatibility condition $\phi' = ux' + vy'$. We
consider solutions of (6a,b) having continuous second derivatives in a
suitable neighborhood of C and on C. Then we can state the following
uniqueness theorems:


$\mathfrak{E}$. THE CASE OF ELLIPTIC DATA. Let G be a one-sided neighborhood of
C,⁴ and let $\phi(x,y)$, $\bar{\phi}(x,y)$ be solutions of (6a,b) in G taking on the limit
values $\phi$, $u$, $v$ on C, and satisfying in G and on C the inequality $\phi_x^2 + \phi_y^2 < c^2$
[i.e. $\phi_x^2 + \phi_y^2 < 2c_0^2/(1 + \gamma)$]; then $\bar{\phi}(x,y) \equiv \phi(x,y)$ in G.⁵


$\mathfrak{H}$. THE CASE OF HYPERBOLIC DATA. Let C be non-characteristic with
respect to the given data, and let $u^2 + v^2 > c^2$. Let $\phi(x,y)$ be a solution
of (6a,b) taking on the limit values $\phi$, $u$, $v$ on C and satisfying the inequality
$\phi_x^2 + \phi_y^2 > c^2$ in a characteristic triangle region G bounded by C and two
characteristics of (6a,b) (defined with respect to $\phi$). Then if $\bar{\phi}(x,y)$
is any solution of (6a,b) in G satisfying the same conditions on C,
$\bar{\phi}(x,y) \equiv \phi(x,y)$ in G [5].




The corresponding theorem for the case of parabolic data on C seems 
to be incomplete at present. It is stated here in terms of the flow quantities 
under somewhat restricted hypotheses, and is proved in Section 4. 


$\mathfrak{P}$. THE CASE OF PARABOLIC DATA. On C, let the flow speed be sonic, the
tangential component of velocity $\neq 0$, and the normal derivative of the speed
$\neq 0$ (it follows that on one side of C there is a neighborhood on which any
solution of (6a,b) with values $\phi$, $u$, $v$ on C is supersonic and on the other
side a neighborhood in which the flow is subsonic). Let $\phi(x,y)$ be a solution


⁴ It is assumed that C is part of the boundary of G, and that some neighborhood
of every interior point of C contains no boundary points of G other than points of C.

⁵ For a proof see [4], in particular, pp. 325-326 of this reference, where a simple
proof is given for the case that C is an analytic curve, and the given data on C are
analytic, which is the case of interest for Lemma 3.




of (6a) taking on the initial values $\phi$, $u$, $v$ on C and defined either (a) in a
characteristic triangle G on the supersonic side of C, or (b) in a subsonic
neighborhood G; then if $\bar{\phi}(x,y)$ is any other solution defined in G and
satisfying the same conditions on C, $\bar{\phi}(x,y) \equiv \phi(x,y)$ in G.⁶


The proof of this theorem is deferred to Section 4. It is to be noted
that Theorem $\mathfrak{P}$ does not apply to the simple situation that C is a straight
streamline on which the velocity is sonic, for, in this case, the normal derivative
of the speed is zero on C. This is the essential gap in the theory which
follows. We formulate this conjectured theorem for reference purposes.


$\mathfrak{P}'$. Let C be a straight line segment of a streamline and on C let the
speed be sonic and constant. Then in a neighborhood on either side of C,
the only solution of (6a,b) is the uniform sonic flow.


This uniqueness theorem, if true, will probably hold in any neighborhood
of C contained within the strip bounded by the two perpendiculars to C at
its end points.

The following lemma will prove to be essential in the sequel. 


LEMMA 3. In a plane irrotational flow, let C be either (a) a circular
streamline segment, or (b) a closed circular streamline, or (c) an infinite
straight streamline which does not approach the boundary of the flow. On C
let the speed, pressure, and density be constant, and let C be curved if the
speed is sonic on it. Then, respectively, (a) in some neighborhood containing
every subarc of C in its interior, or (b) in a concentric annulus containing C,
or (c) in an infinite parallel strip containing C, the flow is a vortex flow. If
the flow is axially symmetric, and C is straight, then the flow in the respective
neighborhoods of C is uniform.


Proof. Consider first plane flows. (a) Through any interior point $a \in C$
consider the curve orthogonal to the streamline family. Let L be a segment
of the curve such that the streamlines through L sweep out a full neighborhood
G of C. There exists such a congruence of streamlines since $q \neq 0$
in the neighborhood of C. If the speed is sonic on C, C is curved by hypothesis,
and G can be chosen so small that the streamlines in it have non-zero
curvature.

L intersects the sets A and B in subsets $A'$ and $B'$ which are respectively
closed and open on L. Since $A'$ is closed, it may contain closed subarcs of L,


⁶ For the case that G is on the supersonic side of C, this theorem has been proved
by L. Bers [6]. The method of proof is quite different from the one shown in Section 4
of this paper.




and forms at all its other points a nowhere dense set. If $a$ is interior to a
subarc $\sigma$ of $A'$ or is an end point, then there is a subregion of G belonging
to A swept out by the congruence of streamlines through this arc, and consequently
equation (6a) for the potential of an isentropic flow describes the
flow in this region. The vortex flow is an obvious solution of (6a) satisfying
the stated conditions on C, and also the hypotheses of the appropriate uniqueness
theorem $\mathfrak{E}$, $\mathfrak{H}$, or $\mathfrak{P}$.⁷ It follows that according as $a$ is an end point
or an interior point of $\sigma$, the vortex flow is the only solution in a one-sided
neighborhood of C, or in a two-sided neighborhood containing every subarc of C in
its interior. (If the flow is sonic or supersonic on C, repeated application
of $\mathfrak{P}$ or $\mathfrak{H}$ is generally necessary to establish uniqueness in this neighborhood.)
Also, if $a$ is an isolated point of $A'$, then clearly the flows on both sides of C
are vortex flows (belonging to B), which, combined with C, form a connected
vortex flow.

Suppose, then, that $a$ is neither an isolated nor an interior point of $A'$.
If there are subarcs of $A'$ in every neighborhood of $a$, let $\sigma$ be such a segment,
$b$ the end point of $\sigma$ closer to $a$, and $b'$ the farther end point. The region
$S \subset G$ swept out by the streamlines through $\sigma$ belongs to the isentropic set,
and in it, therefore, equation (6a) applies. Since $b$ is a boundary point of $B'$,
hence of B, the streamline $Q \ni b$ is, by Lemma 2, a circular arc on which
$q$, $p$, $\rho$ are constant. Again an obvious solution of (6a) which holds in S for
these values on Q is that given by the vortex flow. If the vortex flow is
wholly subsonic or supersonic in the interior of S, and if G was initially
chosen sufficiently small,⁸ we can then apply the appropriate uniqueness
theorems $\mathfrak{E}$, $\mathfrak{H}$, or $\mathfrak{P}$, to establish the uniqueness of the vortex flow in all of S.
If, however, neither of the inequalities $q^2 \gtrless \gamma p/\rho$ holds exclusively for the
vortex flow in the interior of S, then the proof of uniqueness is in two steps,
first by means of $\mathfrak{E}$ (or $\mathfrak{H}$) in the subsonic (or supersonic) strip between Q
and the sonic circular streamline in S (which must exist as a consequence
of the uniqueness proof), and then by application of Theorem $\mathfrak{P}$ to the strip
of S between the sonic streamline and the streamline through $b'$. It has thus
been proved that the flow through any subarc $\sigma \subset A'$ is a vortex flow.

Since the flow in G through the subarcs of $A'$ and $B'$ consists of vortex


⁷ If C is an arc of radius $r$, $dq/dr = -q/r \neq 0$, so that the hypotheses of $\mathfrak{P}$ are
satisfied if the speed is sonic on C.

⁸ It suffices to choose G as follows: If $q^2 \gtrless \gamma p/\rho$ on C, let G be so small that the
corresponding inequality holds throughout it; furthermore, if $q^2 > \gamma p/\rho$ in G, let M be
an upper bound of $q/(\gamma p/\rho)^{1/2}$ in G, and consider the region cut out by the four curves
through the end points of C which make the acute angle $\arcsin (1/M)$ with C and with
the streamlines in G; let this new region be the G of the proof.




flow strips, it remains only to show that these belong to a single vortex flow
in G. That the streamlines through L are concentric circular arcs is evident.
Then in G, $q = k(r)/r$ (or $q = k(r)$ if the streamlines are all straight),
where $k(r)$ is the constant of the vortex flow at points a distance $r$ from the
common center (or from a common parallel). That $k'(r) = 0$ follows, since
$k'(r)$ is continuous and equal to zero on a dense set of values of $r$ on L,
namely, on the subarcs of $A'$ and $B'$. Hence $k(r)$ is a constant and $q = k/r$
in G for fixed $k$. The constancy of $p$ and $\rho$ on streamlines in G is evident.
This completes the proof of part (a).


Case (b), that C is a closed circle, is reduced to (a) in an evident way
by subdividing C into several overlapping arcs to which the lemma now
applies. This determines an annulus about C in which the flow is a vortex
flow.

To prove the lemma for case (c), we confine our attention to a strip on
each side of C of width less than the distance of C from the boundary of the
flow. Consider any finite segment $C' \subset C$; then by part (a) of the lemma,
the flow in a neighborhood G of $C'$ must be uniform. As in the proof of (a),
we can define an orthogonal segment L and on it sets $A'$ and $B'$, where now
the streamlines through the boundary points of $B'$ are infinite straight lines
on which $q$, $p$, $\rho$ are constant, and $q^2 \neq \gamma p/\rho$ if G is sufficiently small. By
paralleling the proof of part (a), one sees that it suffices to prove that, if
$q$, $p$, $\rho$ are constant with $q^2 \neq \gamma p/\rho$ on an infinite straight streamline Q,
and the flow is uniform and isentropic in a finite strip $s$ of width $\delta$ on one
side of Q, then the flow is uniform in the infinite strip S of width $\delta$. This
follows by virtue of Lemma 1 and then $\mathfrak{E}$ or $\mathfrak{H}$.


The proof of the lemma for the case that the flow is axially symmetric 
and C is straight is the same except for evident verbal changes. 

The restriction that C be curved in case the speed is sonic could be
dropped if Theorem $\mathfrak{P}'$ were proved.

The proof of Lemma 3 can be greatly simplified if the data on the initial
arc C are not sonic. This is seen as follows. Equations (6a,b) apply to the
entire field of flow, and not merely to the isentropic regions, if $c_0$ is some
appropriate function of the coordinates which is constant on streamlines.
The generalized uniqueness Theorems $\mathfrak{E}$ and $\mathfrak{H}$ for this more general form
of (6a,b) are contained implicitly in the Carleman uniqueness theorem [10].
With these stronger uniqueness theorems, it is clear that the proof of Lemma 3
is almost trivial for the case of nonsonic data on C.


The following theorems in the large can now be proved. 




THEOREM 2. A plane irrotational flow which is subsonic throughout a 
region R is either isentropic or a vortex flow in R. 


Proof. Assume the flow to be non-isentropic. Then by Theorem 1,
Corollary 1, R contains a subregion of vortex flow $B^* \subset B$. Let $\bar{R}$ be the
maximal vortex flow region in R containing $B^*$. We wish to show $\bar{R} = R$.
If not, let P be a boundary point of $\bar{R}$. If $P \in B$, then a neighborhood of
it is a vortex flow region, which contains a subregion of $\bar{R}$, and is therefore
a continuation of the vortex flow in $\bar{R}$, thereby contradicting the definition
of $\bar{R}$ as a maximal region of vortex flow. Similarly, if P is a boundary point of B,
then the streamline through P is a circular arc on which $q$, $p$, $\rho$ are constant.
Therefore, by Lemma 3, a neighborhood of P is a vortex flow region, and,
as above, this contradicts the maximal character of $\bar{R}$. Finally, if P is an
interior point of A, then, in a neighborhood of P, equation (6a) holds for
the velocity potential, the flow being isentropic in this neighborhood. The
equation is also of elliptic type by the hypothesis that the flow is subsonic. Since
the flow is a vortex flow in a subregion of this neighborhood, then, by the
well-known theorem on the analyticity of solutions of elliptic differential
equations [7], the flow in the entire neighborhood is of vortex type. As in
the previous two cases, this leads to a contradiction. It follows that $\bar{R}$,
having no boundary points in R, is both open and closed in the connected
set R; hence $\bar{R} = R$. This proves the theorem.


THEOREM 3. An axially symmetric irrotational flow which is subsonic 
throughout a region R is either isentropic or a uniform flow in R. 


The proof proceeds as in Theorem 2, except for several obvious changes 
appropriate to axially symmetric flows. 

One cannot expect to extend these theorems in all generality to mixed 
or supersonic flows, as the following simple counterexample shows. 




In the figure let the flow in region I be uniform and non-isentropic,
$\eta = \eta(y)$ a non-constant function. In II the flow is uniform, $q$ having the
same constant value as in I, but $\eta$, $p$, $\rho$ are constant, and such that $q^2 > \gamma p/\rho$.
In region III, the flow is a non-uniform continuation of the flow past the
characteristic C, the curve L being any suitable smooth curve forming a cusp
at O. The flow in the entire region I + II + III is thus neither isentropic
nor a vortex flow.

A suitable generalization of Theorems 2 and 3 to mixed and supersonic 
flows requires limitations on the heretofore arbitrary character of the flow 
regions. The limitation we adopt here is one reflecting the physical condition 
that flows exist in the large, in the sense that particles do not enter and vanish 
through the boundaries of the flow region. This restriction is made precise 
by demanding that the flow region have streamline boundaries: that is, the 
boundary will consist of piecewise smooth curves which are streamlines of the 
flow on which we require, furthermore, that the velocity be continuous. 

The above example shows that, even for these more restricted flow regions,
extensions of Theorems 2 and 3 do not hold generally. However, appropriate
generalizations are obtained if we exclude from the class of admissible boundary
curves the following special types of curves:⁹ (1) infinite half lines,
(2) circular arcs, (3) closed curves, consisting in part of a circular arc, with
cusps at both ends of the arc, (4) curves, consisting in part of a half line,
with a cusp at the end of the half line. The boundary curve of the counterexample
flow shown in the figure is of type (4); similar examples can easily
be constructed for flows with boundaries of the types (1)-(3). The admissible
boundary curves obtained from the exclusion of types (1)-(4) will for the
sake of brevity be called non-cusped streamlines (although this name signifies
a smaller class than is being considered). With this definition, we have


THEOREM 4. A plane subsonic or supersonic irrotational flow in a region 
bounded by non-cusped streamlines is either isentropic or a vortex flow. 


Proof. The subsonic case has already been proved for general regions
in Theorem 2; we therefore confine ourselves to supersonic flows. Assume the
flow is non-isentropic, and let $B^*$ be a vortex flow component of B. The
streamlines of $B^*$ must all be closed circles or infinite straight lines, for
otherwise (Lemma 1) a streamline $Q \subset B^*$ would terminate at the boundary
of the flow region R, meeting there a curve which, at the same time, would
have to be a bounding streamline of B; this is possible only if the latter


⁹ These curves are assumed to be maximal, i.e., they are not contained within a
larger connected portion of the boundary.




curve is a circular arc continuation of Q, and is therefore either a circular
arc or an infinite half line, contrary to our assumption concerning boundary
curves. Let $r(P)$ be the distance to any point P in the plane from the center
of $B^*$, or, if the flow in $B^*$ is uniform, from some fixed line parallel to the
streamlines of $B^*$. Define $m$ and $M$ to be such that (1) the annulus
$m < r(P) < M$ lies in the maximal vortex region containing $B^*$, and (2)
$m$ and $M$ are respectively the minimum and maximum (possibly infinite)
such values of $m$ and $M$. Such an $m$ and $M$ exist since $B^*$ itself is an
annulus satisfying (1). Assume $M$ is finite; we shall then show that the
entire circle, or straight line, $C$: $r = M$, is a boundary streamline of the flow.
If this were not so, then either it lies wholly interior to R, or consists only
partly of boundary points of R. In the former case, the curve C is a streamline
in R on which $q$, $p$, $\rho$ are constant and $q^2 > \gamma p/\rho$, and hence to which
Lemma 3b or 3c can be applied.¹⁰ Thus there is a strip $M \le r(P) < M'$
within which the flow is a vortex flow, thereby contradicting the choice of $M$
as a maximum. On the other hand, C may consist partially of boundary
points of R. These form a closed set $C'$ on C, containing generally circular
arcs and a set of points not dense in any segment of C. By definition of the
class of boundary curves, the boundary curve containing one of these arcs
must have a non-cusped vertex on C. It follows from the continuity of the
velocity on the boundary that such a vertex is a stagnation point, which is
impossible since it is on the boundary of a vortex flow. $C'$ therefore does not
contain arcs. But $C'$ cannot contain other points of the flow boundary,
for, as one readily sees, this would contradict the hypothesis that the boundary
curve containing such a point must be a streamline on which the flow
is continuous. It follows that either $M = \infty$, or the entire curve $r = M$
bounds the flow in R. In precisely the same way one shows that the curve
$r = m$ is a bounding streamline. Since the flow region is connected, the
curves $r = m$ and $r = M$ are the only boundary curves. Thus the flow is
either isentropic or is a vortex flow in the annulus or parallel strip $m < r < M$.
By making suitable verbal changes in the preceding proof we have,


THEOREM 5. An axially symmetric subsonic or supersonic irrotational
flow in a region bounded by non-cusped streamlines is either isentropic or a 
uniform flow. 


¹⁰ Lemma 3c requires that C does not approach the boundary of R; but if the
boundary curve is a streamline, the proof of the lemma shows that the distance between
C and this curve must stay above some fixed value greater than 0.

Theorems 4 and 5 could be stated for mixed flows provided the conjectured
uniqueness theorem $\mathfrak{P}'$ were true. For in this case, Lemma 3 would
hold without exception, and hence the preceding proofs would go through
just as well with the curve C a straight line on which $q^2 = \gamma p/\rho$. However,
if C is curved, and $q^2 = \gamma p/\rho$ on it, the proof of Theorem 4 still holds without
the limitation to supersonic or subsonic flows. To insure that C be curved,
it suffices that the streamlines of the non-isentropic component $B^*$ be curved.
We thus have the following limited theorem for mixed flows.


THEOREM 6. If a plane irrotational flow in a region bounded by non-cusped
streamlines is non-isentropic, and if the non-isentropic set contains
a curved streamline, then the flow is a vortex flow.

The proofs of preceding theorems assist in characterizing the wider class
of flows obtained by dropping the limitation to non-cusped streamline boundaries.
They show that the general irrotational flow with streamline boundary
is "pieced together" from isentropic and vortex flows along circular arcs
or straight lines in such a way that the new boundary curves formed by the
piecing process are curves of the exceptional set (1)-(4). The flow shown in
the figure is a simple example of this.


4. Uniqueness theorem for parabolic data. The proof of Theorem $\mathfrak{P}$
is achieved by the use of the well-known transformations of (6a) to the hodograph
plane. Let us consider, for example, the Legendre transformation,
$u = u(x,y)$, $v = v(x,y)$, $\Phi(u,v) = xu + yv - \phi$, by means of which (6a)
is transformed into

(7) $(1 - u^2/c^2)\Phi_{vv} + 2(uv/c^2)\Phi_{uv} + (1 - v^2/c^2)\Phi_{uu} = 0$,
$c^2 = c_0^2 + \tfrac{1}{2}(1 - \gamma)(u^2 + v^2)$, $c_0$ = constant.


To reduce the uniqueness theorem for (6a) to one for the above equation,
it must first be established that any solution $\phi(x,y)$ of the former equation,
which satisfies the hypothesis of Theorem $\mathfrak{P}$ on the initial curve C, provides
a transformation to the hodograph plane which is 1-1 in the neighborhood
of C. This is shown by Bers [6], who proves that C is mapped in a 1-1
way onto an arc $\bar{C}$ of the circle $u^2 + v^2 = c^2$ $(= 2c_0^2/(1 + \gamma))$ in the hodograph
plane, and that the Jacobian of the transformation $\partial(u,v)/\partial(x,y)$ does not
vanish on C, hence in a neighborhood.¹¹ Thus any solution $\phi(x,y)$ of (6a)


¹¹ This is easily seen for the particular application of Theorem $\mathfrak{P}$ to Lemma 3,
where C is a streamline. For one obtains $|\partial(u,v)/\partial(x,y)| = (uv' - vu')^2/\phi'^2$ along
any sonic initial curve C, and if, in particular, C is a streamline, on which, by hypothesis,
$dq/dn \neq 0$ and $q = \phi' \neq 0$, then numerator and denominator of the right member
are evidently non-zero.
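The non-vanishing of the hodograph Jacobian can be illustrated on the vortex flow, the case that matters for Lemma 3. The sketch below is only a numerical check under stated assumptions: the vortex constant `k = 1.0`, the radius and angle used in the example, the step size, and all function names are illustrative. For $u = -ky/r^2$, $v = kx/r^2$ it estimates $\partial(u,v)/\partial(x,y)$ by central differences, and separately evaluates the footnote's expression $(uv' - vu')^2/\phi'^2$ along a circular streamline, with primes denoting derivatives with respect to arc length and $\phi' = q$. The check does not impose the sonic condition; for the vortex flow both quantities equal $k^2/r^4$ in absolute value on every circular streamline.

```python
import math

def vortex_velocity(x, y, k=1.0):
    """Plane vortex flow: u = -k*y/r^2, v = k*x/r^2, so that q = k/r."""
    r2 = x * x + y * y
    return (-k * y / r2, k * x / r2)

def hodograph_jacobian(x, y, k=1.0, h=1e-5):
    """Central-difference estimate of d(u, v)/d(x, y) = u_x*v_y - u_y*v_x."""
    u_xp, v_xp = vortex_velocity(x + h, y, k)
    u_xm, v_xm = vortex_velocity(x - h, y, k)
    u_yp, v_yp = vortex_velocity(x, y + h, k)
    u_ym, v_ym = vortex_velocity(x, y - h, k)
    u_x, v_x = (u_xp - u_xm) / (2 * h), (v_xp - v_xm) / (2 * h)
    u_y, v_y = (u_yp - u_ym) / (2 * h), (v_yp - v_ym) / (2 * h)
    return u_x * v_y - u_y * v_x

def footnote_expression(r, theta, k=1.0, h=1e-5):
    """(u*v' - v*u')^2 / phi'^2 on the circular streamline of radius r."""
    def velocity_at(s):
        # Arc length s measured along the circle from the angle theta.
        t = theta + s / r
        return vortex_velocity(r * math.cos(t), r * math.sin(t), k)
    u, v = velocity_at(0.0)
    u_prime = (velocity_at(h)[0] - velocity_at(-h)[0]) / (2 * h)
    v_prime = (velocity_at(h)[1] - velocity_at(-h)[1]) / (2 * h)
    q = math.hypot(u, v)  # phi' equals the speed q along a streamline
    return (u * v_prime - v * u_prime) ** 2 / (q * q)

if __name__ == "__main__":
    r, theta = 2.0, 0.7
    x, y = r * math.cos(theta), r * math.sin(theta)
    print(hodograph_jacobian(x, y), footnote_expression(r, theta), 1.0 / r ** 4)
```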




which satisfies the prescribed conditions on C, is transformed into a solution
$\Phi(u,v)$ of (7) which on $\bar{C}$ takes on the initial values $\Phi(s) = xu + yv - \phi$,
$\Phi_u = x(s)$, $\Phi_v = y(s)$. That such a solution is unique in a neighborhood
of $\bar{C}$ is established by means of the following special case of a theorem due to
Holmgren [8, 9].

Consider the equation,

$L(w) = w_{xx} + a w_{xy} + b w_{yy} + c w_x + d w_y + e w = 0$,

where the coefficients $a$, $b$, $c$, $d$, $e$ are analytic functions of $x$ and $y$. Let C
be a curve which is non-characteristic (at every point) with respect to $L(w)$,
and let $w(x,y)$ be a solution which takes on the values $w = w_x = w_y = 0$
on C, and which is continuous up to its second derivatives in a neighborhood
and on C. Then $w(x,y) \equiv 0$ in a neighborhood of C.


Equation (7) satisfies the conditions of Holmgren's theorem, and thus
$\Phi(u,v)$ is unique in a neighborhood of $\bar{C}$. This implies uniqueness of $\phi(x,y)$
in a neighborhood U of C. To complete the proof of Theorem $\mathfrak{P}$, it must
be shown that if the solution $\phi(x,y)$ of the theorem is defined in a characteristic
triangle G on the supersonic side of C, then it is unique in all of G, and
similarly for a neighborhood G on the subsonic side. However, this follows
from the uniqueness result already proved for U, since in U, arbitrarily close
to C, there are non-characteristic curves $C'$, on which the initial data $\phi$, $\phi_x$, $\phi_y$
are uniquely determined, and to which the uniqueness theorems $\mathfrak{E}$ or $\mathfrak{H}$ can
now be applied (according as $C'$ is on the subsonic or supersonic side of C);
this shows the uniqueness of $\phi$ in an arbitrarily large subregion of G, and
therefore in all of G.


It is evident that Theorem $\mathfrak{P}'$ cannot be proved by this method, since
in this case the hodograph transformation degenerates to a single point.

Remark 1. The restriction to the polytropic gas law (4) is not essential
to the preceding results. One sees readily that the proofs remain basically
unaltered if the polytropic law is replaced by the general equation of state,

$p = f(\rho, \eta)$; ($f_\eta$, $f_\rho$, $f_{\rho\rho} \neq 0$ for $\rho > 0$),

where $f(\rho, \eta)$ is analytic in $\rho$ for every $\eta$. The latter condition is necessary
to insure the analyticity of equations (6a,b); this is a requirement for the
proof of $\mathfrak{P}$, and also for the theorem on the analytic character of solutions
of elliptic equations, which is quoted in the proof of Theorems 2 and 3.


Remark 2. The preceding methods are not directly applicable to general
spatial flows primarily because of difficulties in the proof of the generalized




Lemma 3. However, this obstacle could be easily overcome with the help 
of stronger uniqueness theorems than are now available for differential 
equations in three independent variables. 


INDIANA UNIVERSITY. 


REFERENCES.

[1] B. Hicks, "Diabatic flow of a compressible fluid," Quarterly of Applied Mathematics, vol. 6 (1948), pp. 221-237.

[2] C. Truesdell and R. Prim, Vorticity and the thermodynamic state in the flow of an inviscid fluid, Naval Ordnance Laboratory Memorandum 9416 (1947).

[3] G. Hamel, "Potentialströmungen mit konstanter Geschwindigkeit," Sitzungsberichte der Preussischen Akademie der Wissenschaften, 1937, pp. 5-20.

[4] H. Lewy, "Eindeutigkeit der Lösung des Anfangsproblems einer elliptischen Differentialgleichung zweiter Ordnung in zwei Veränderlichen," Mathematische Annalen, vol. 104 (1930-31), pp. 325-329.

[5] R. Courant and K. Friedrichs, Supersonic flow and shock waves, Interscience, 1948, Chapter 2.

[6] L. Bers, On the continuation of a potential gas flow across the sonic line, forthcoming NACA Technical Note.

[7] E. Hopf, "Über den funktionalen, insbesondere den analytischen Charakter der Lösungen elliptischer Differentialgleichungen zweiter Ordnung," Mathematische Zeitschrift, vol. 34 (1931-32), pp. 194-233.

[8] J. Hadamard, Leçons sur la propagation des ondes, Hermann, Paris, 1903, note I.

[9] E. Holmgren, "Über Systeme von linearen partiellen Differentialgleichungen," Öfversigt af Konigl. Svenska Vetenskaps-Akademiens Förhandlingar, 1901, pp. 91-103.

[10] T. Carleman, "Sur un problème d'unicité pour les systèmes d'équations aux dérivées partielles à deux variables indépendantes," Arkiv för Matematik, Astronomi och Fysik, vol. 26B (1938-39), no. 17.




A CHARACTERIZATION OF THE NORMED VECTOR ORDERED 
SPACES OF CONTINUOUS FUNCTIONS OVER 
A COMPACT SPACE.* 


By LEOPOLDO NACHBIN.†


Given a compact space E, the system c(E) of all real-valued continuous
functions over E is a normed (real) vector ordered space, or more precisely,
a normed vector lattice. The problem of finding a characterization up to an
isomorphism of the normed vector sublattices of some c(E) was considered
previously by S. Kakutani, H. F. Bohnenblust, M. Krein and S. Krein.¹
In this note we shall consider the problem of giving a characterization up
to an isomorphism of the normed vector ordered sub-spaces of some c(E).²

Let S be a normed (real) vector space³ which at the same time is an
ordered vector space.⁴ Following the usual terminology, we shall say that
x is positive if $x \geq 0$, and negative if $x \leq 0$. We shall also say that x is
semi-positive if $y \geq x$ always implies $\|y\| \geq \|x\|$, and that x is semi-negative⁵
if $y \leq x$ always implies $\|y\| \geq \|x\|$.


Our aim is to prove the following theorem. 


THEOREM 1. In order that S be isomorphic in the order, vector and
norm sense to a normed vector ordered sub-space of c(E), for some compact
space E, it is necessary and sufficient that (1) every element of S be either
semi-positive or semi-negative, and (2) the set of all positive elements of S
be closed.

Before going into the proof of this theorem, we shall first state and 


prove the following theorem concerning the extension of linear functionals, 
which may be considered as a generalization of the well known Hahn-Banach 


* Received November 18, 1948.

† Fellow of the U. S. A. State Department.

¹ See the bibliography at the end of this paper.

² This case was also considered by M. Krein [8].

³ See Banach [2].

⁴ That is, S is a vector space which is (partially) ordered in such a way that
$x \leq y$ implies $x + z \leq y + z$ and $\lambda x \leq \lambda y$ for $\lambda \geq 0$.

⁵ The origin of this terminology is the fact that in the real line the notions of
positiveness and semi-positiveness coincide.




theorem⁶ to the case of ordered spaces, this last theorem being obtained as a
particular case when the ordering relation is taken to be the discrete one.


THEOREM 2. Let V be a vector sub-space of S, f be a linear functional
defined over V and $k \geq 0$. In order that there should exist a positive⁷
continuous linear extension F of f defined over S such that $\|F\| \leq k$ it is
necessary and sufficient that $v \in V$, $w \in S$, $v \leq w$ imply $f(v) \leq k\|w\|$.


The condition is clearly necessary because, if such an extension does exist
and $v \in V$, $w \in S$, $v \leq w$, then $f(v) = F(v) \leq F(w) \leq \|F\|\,\|w\| \leq k\|w\|$.

To prove that the condition is sufficient let us assume that it is satisfied.
For each $x \in S$, define $P(x)$ as the greatest lower bound of $\|y\|$ for all $y \in S$
such that $y \geq x$. It is clear that $P(x) \geq 0$, $P(\lambda x) = \lambda P(x)$ for $\lambda \geq 0$ and
$P(x_1 + x_2) \leq P(x_1) + P(x_2)$. Our hypothesis that the condition is satisfied
is equivalent to saying that $f(v) \leq p(v)$ for all $v \in V$, where $p(x) = kP(x)$.
Applying a known result,⁸ we see that there exists a linear extension F of f,
defined over S, such that $F(x) \leq p(x)$ for all $x \in S$. If $x \geq 0$, that is,
$-x \leq 0$, then $P(-x) = 0$ and therefore $F(-x) \leq p(-x) = 0$, that is
$F(x) \geq 0$, so that F is positive. Moreover it is clear from the definition of
P that $P(x) \leq \|x\|$ and therefore $F(x) \leq p(x) \leq k\|x\|$. Replacing x
by $-x$ we get $F(x) \geq -k\|x\|$ which together with $F(x) \leq k\|x\|$ shows
that $|F(x)| \leq k\|x\|$, that is $\|F\| \leq k$.


It is not without interest to notice that, in the ordered spaces case, a linear
positive continuous functional does not need to possess a linear positive
continuous extension without having to increase its norm. For instance, if S is the
plane with its usual ordering and norm, and V is the set of all $x = (\xi, -\xi)$,
then the functional f defined over V by $f(x) = \xi$ is positive (because the
ordering on V is discrete) and has norm $\|f\| = 1/\sqrt{2}$, but any positive
extension F of f has norm $\|F\| \geq 1$.
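The counterexample above can be checked numerically. The following is only a sketch under stated assumptions: "usual ordering and norm" is read as the componentwise order and the Euclidean norm on the plane, and the function names are illustrative rather than taken from the paper. A linear functional $F(a, b) = \alpha a + \beta b$ is positive exactly when $\alpha, \beta \geq 0$, and it extends f exactly when $\alpha - \beta = 1$.

```python
import math

# Assumption: componentwise order and Euclidean norm on the plane.

def norm_of_f_on_V():
    # f(xi, -xi) = xi and ||(xi, -xi)|| = sqrt(2) * |xi|,
    # so the ratio |f(x)| / ||x|| is the same for every nonzero x in V.
    xi = 1.0
    return abs(xi) / math.hypot(xi, -xi)

def norm_of_positive_extension(beta):
    # F(a, b) = alpha*a + beta*b extends f iff alpha - beta = 1.
    # Positivity requires alpha >= 0 and beta >= 0, hence alpha = 1 + beta >= 1.
    alpha = 1.0 + beta
    return math.hypot(alpha, beta)

norm_f = norm_of_f_on_V()
# Scan a grid of admissible beta values; the minimum is attained at beta = 0.
smallest_extension_norm = min(
    norm_of_positive_extension(k / 100.0) for k in range(501)
)
```

On this grid the smallest extension norm is attained at $\beta = 0$, in agreement with the claim that every positive extension has norm at least 1 while f itself has norm $1/\sqrt{2}$.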

We shall now prove Theorem 1. 

Let E be a compact space. If $x \in c(E)$, then $\|x\| = \|x_+\| \vee \|x_-\|$
where $x_+ = x \vee 0$ and $x_- = x \wedge 0$. Let us assume that $\|x_+\| \geq \|x_-\|$ so
that $\|x\| = \|x_+\|$. If $y \in c(E)$ and $y \geq x$, then $y_+ \geq x_+$ and therefore
$\|y\| \geq \|y_+\| \geq \|x_+\| = \|x\|$, from which it follows $\|y\| \geq \|x\|$ and x is
semi-positive. In the same way we see that, if $\|x_-\| \geq \|x_+\|$ then x is
semi-negative. From this it follows immediately that condition (1) is necessary.
The necessity of condition (2) is clear.


⁶ See Banach [2], p. 55.

⁷ A linear functional F is called positive if $F(x) \geq 0$ whenever $x \geq 0$.

⁸ See Banach [2], p. 28.




Now let us assume that S satisfies conditions (1) and (2). Let S* be
the dual of S, that is the vector space of all linear continuous functionals
over S endowed with its natural norm and ordering. We shall also consider
on S* its weak topology obtained by making the points of S act on S* as
linear functionals: it is known that the unit sphere U of S* is weakly
compact.⁹ Moreover the set P of all positive elements of S* is clearly weakly
closed. Putting $E = U \cap P$ we get a compact space.

For every $x \in S$ let X be the real-valued function defined over E by
$X(f) = f(x)$, where $f \in E$. It is clear that X is continuous over E and that
the mapping $x \to X$ is a linear transformation from S into c(E).


The mapping $x \to X$ is isometric. In fact, we have $|X(f)| = |f(x)|
\leq \|f\|\,\|x\| \leq \|x\|$ and therefore $\|X\| \leq \|x\|$. On the other hand,
every $x \in S$ is either semi-positive or semi-negative. Assume that we are in
the first case and $x \neq 0$ (the case $x = 0$ being trivial). Let V be the set of
all $v = \lambda x$ and let f be defined over V by $f(v) = \lambda\|x\|$. If $v \in V$, $w \in S$,
$v \leq w$ then $f(v) \leq \|w\|$. In fact, we have $v = \lambda x$: if $\lambda \leq 0$ then
$f(v) \leq 0 \leq \|w\|$ is clear, and if $\lambda > 0$ then $v = \lambda x \leq w$, that is
$x \leq w/\lambda$, implies $\|x\| \leq \|w/\lambda\|$ because x is semi-positive and $f(v) \leq \|w\|$
is still true. This being so, we may apply Theorem 2 and get a linear
positive extension F of f defined over S such that $\|F\| \leq 1$. Then $F \in E$
and $X(F) = F(x) = f(x) = \|x\|$. If x, instead of being semi-positive, is
semi-negative we may get some $F \in E$ with $X(F) = -\|x\|$. In any case
we see that $\|X\| \geq \|x\|$. The final conclusion is that $\|X\| = \|x\|$.
The mapping $x \to X$ is one-to-one between S and its range: this follows
from the above result.


Finally the mapping $x \to X$ is order-preserving in both directions. If
$x \geq 0$ then $X(f) = f(x) \geq 0$, that is $X \geq 0$. Assume now that $X \geq 0$. We
want to show that $x \geq 0$ and therefore may assume $x \neq 0$. Put $\delta = P(-x)$,
where P is the function introduced in the proof of Theorem 2. Let V be
the set of all $v = \lambda x$ and let f be defined over V by $f(v) = -\lambda\delta$. If $v \in V$,
$w \in S$, $v \leq w$ then $f(v) \leq \|w\|$. In fact, if $v = \lambda x$ and $\lambda \geq 0$ then
$f(v) = -\lambda\delta \leq 0 \leq \|w\|$ is clear, and if $\lambda < 0$ then $v = \lambda x \leq w$ implies
$-x \leq -w/\lambda$ and therefore $P(-x) \leq \|w/\lambda\|$, by the definition of $P(-x)$,
and $f(v) \leq \|w\|$ still is true. Applying once more Theorem 2 we obtain
a linear positive extension F of f defined over S with $\|F\| \leq 1$. But then
$F \in E$ and $X(F) = f(x) = -\delta$ and we must have $\delta = 0$, that is, $P(-x) = 0$.
This implies the existence of a sequence $y_n \in S$ ($n = 1, 2, \dots$) such that


⁹ See Bourbaki [4] and also Alaoglu [1].




$y_n \geq -x$ and $\|y_n\| \to 0$. By condition (2) we conclude that $x \geq 0$.
Theorem 1 is therefore proved.
An alternative form for Theorem 1 is the following: 


THEOREM 1′. In order that S be isomorphic in the order, vector and
norm sense to a normed vector ordered sub-space of c(E), for some compact
space E, it is necessary and sufficient that: (1′) if $x \leq y \leq z$ then
$\|y\| \leq \|x\| \vee \|z\|$, and (2′) the set of all positive elements of S be closed.


It is sufficient to notice that conditions (1) and (1’) are equivalent. 

The idea of using the notions of semi-positive and semi-negative elements 
is also suitable for the lattice case. Let us assume that S is a normed (real) 
vector space and at the same time a vector lattice.¹⁰ Then we have:


THEOREM 3. In order that S be isomorphic in the lattice, vector and
norm sense to a normed vector sub-lattice of c(E), for some compact space E,
it is necessary and sufficient that: (1) every element of S be either
semi-positive or semi-negative, and (2) $\|\,|x|\,\| \leq \|x\|$, where $|x| = x \vee (-x)$.


The necessity of condition (1) was already proved, and that of condition
(2) is clear because it is even true that $\|\,|x|\,\| = \|x\|$.

Now assume that (1) and (2) hold good. If $0 \leq x \leq y$ then $\|x\| \leq \|y\|$.
In fact, this is clear if x is semi-positive, and if x is semi-negative then
$0 \leq x$ implies $\|x\| = 0$, that is, $x = 0$ and the result is still true.

Since $0 \leq x_+ \leq |x|$ and $0 \leq -x_- \leq |x|$, we see that
$\|x_+\| \vee \|x_-\| \leq \|\,|x|\,\|$.

If x is semi-positive, from $x \leq x_+$ we get $\|x\| \leq \|x_+\|$. If x is
semi-negative, the relation $x \geq x_-$ implies $\|x\| \leq \|x_-\|$. In any case we have
$\|x\| \leq \|x_+\| \vee \|x_-\|$.

Combining these last two results with condition (2) we see that
$\|\,|x|\,\| = \|x\|$ and that $\|x\| = \|x_+\| \vee \|x_-\|$.

The fact that $0 \leq x \leq y$ implies $\|x\| \leq \|y\|$, and that $\|x\| = \|\,|x|\,\|$
means that S is a normed lattice, which in addition has the property
$\|x\| = \|x_+\| \vee \|x_-\|$. Incidentally we notice that, conversely, every normed
lattice with this additional property satisfies conditions (1) and (2) of
Theorem 3.

Now we observe that, by a result of Bohnenblust and Kakutani, it is
sufficient to prove that $x \wedge y = 0$ implies $\|x \vee y\| = \|x\| \vee \|y\|$. Since
the condition $x \wedge y = 0$ is necessary and sufficient for the existence of a $z \in S$


¹⁰ That is, in addition to the properties stated in ⁴, S is a lattice (see Birkhoff [3]).




with $x = z_+$, $y = -z_-$, in which case $z = x - y$ and $x \vee y = |z|$,
we see that the Bohnenblust-Kakutani condition is equivalent to
$\|\,|z|\,\| = \|z_+\| \vee \|z_-\|$, that is, to $\|z\| = \|z_+\| \vee \|z_-\|$. Theorem 3 is therefore
proved.


An alternative form for Theorem 3 is obtained by replacing condition 
(1) by its equivalent (1’). 


UNIVERSITY OF CHICAGO. 


BIBLIOGRAPHY. 


1. L. Alaoglu, “ Weak topologies of normed linear spaces,” Annals of Mathematics, 
vol. 41 (1940), pp. 252-267. 

2. S. Banach, Théorie des opérations linéaires, Warszawa, 1932. 

3. G. Birkhoff, Lattice Theory, New York, 1940. 

4. N. Bourbaki, “Sur les espaces de Banach,” Comptes Rendus de l’Académie des 
Sciences de Paris, vol. 206 (1938), pp. 1701-1704. 

5. H. F. Bohnenblust and S. Kakutani, “Concrete representations of (M)-spaces,”
Annals of Mathematics, vol. 42 (1941), pp. 1025-1028.

6. S. Kakutani, “Weak topology, bicompact sets and the principle of duality,” 
Proceedings of the Imperial Academy of Tokyo, vol. 16 (1940), pp. 63-67. 

7. S. Kakutani, “Concrete representations of abstract (M)-spaces,” Annals of 
Mathematics, vol. 42 (1941), pp. 994-1024. 

8. M. Krein, “Propriétés fondamentales des ensembles coniques normaux dans
l’espace de Banach,” Comptes Rendus de l’Académie des Sciences de l’URSS, vol. 28
(1940), pp. 13-17.

9. M. Krein and S. Krein, “On an inner characteristic of the set of all continuous
functions defined on a bicompact Hausdorff space,” Comptes Rendus de l’Académie des
Sciences de l’URSS, vol. 27 (1940), pp. 427-430.

10. M. Krein et S. Krein, “Sur l’espace des fonctions continues définies sur un
bicompact de Hausdorff et ses sousespaces semiordonnés,” Recueil Mathématique, vol.
13 (1943), pp. 1-37.




FREE SUMS OF GROUPS AND THEIR GENERALIZATIONS. 
AN ANALYSIS OF THE ASSOCIATIVE LAW.* 


By REINHOLD BAER 


Introduction. Our problem and our method of approach can best be 
described in a situation in which the problems have been solved for a long 
time: the free sum of two groups with an amalgamated subgroup. (Note 
that the composition of group elements will be called addition in spite of the 
fact that this addition need not be commutative.) In this case there are 


given two groups G and H and a pair of isomorphic subgroups G′ and H′ of
G and H respectively. If α is some fixed isomorphism of G′ upon H′, then
we may identify the element x in G′ and its image xα in H′; and we are led
to the following structure: a group G, a group H and a group S which is a
common subgroup of G and H and which is exactly the intersection of the
two groups G and H. Now we form the join J of the two sets G and H
subject to the requirement $G \cap H = S$; and in J we may define an addition
in the following obvious fashion: if x and y both belong to G (or both belong
to H), then their sum x + y is defined as in G (as in H); and their sum is
undefined otherwise. It is clear that the sum of any two elements in J is
either undefined or uniquely determined. The problem is to imbed this
structure into as general a group as possible, a problem that has been solved by
O. Schreier [1] by his construction of the free sum of G and H with the
amalgamated subgroup S.

We may regain this problem in a somewhat more general form by the
following observation: Let T be a non-vacuous subset of the group L; if x, y
and x + y belong to T, then the sum of the elements x and y in T is the
uniquely determined element x + y; and if x, y belong to T, though x + y
does not belong to T, then we let the sum of x and y be undefined. Thus the
subset T has been made into an additive structure, and one may ask how to
characterize by inner properties those structures that may be imbedded into 
groups. As may be expected, such a general discussion will provide the 
proper framework within which to discuss problems of the type indicated in 


the first paragraph. 
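The construction of an additive structure on a subset T of a group can be sketched in a few lines. This is only an illustration: the helper name `make_add`, the use of `None` for "undefined", and the example T = {1, 2, 3} inside the integers are assumptions made here, not notation from the paper.

```python
def make_add(T, group_op):
    """Partial addition on the subset T of a group with operation group_op."""
    T = frozenset(T)

    def add(x, y):
        if x not in T or y not in T:
            raise ValueError("both summands must belong to T")
        s = group_op(x, y)
        # The sum is defined only when it falls back into T.
        return s if s in T else None

    return add

# Hypothetical example: T = {1, 2, 3} inside the additive group of integers.
add = make_add({1, 2, 3}, lambda x, y: x + y)
defined = add(1, 2)      # 3 lies in T, so the sum is defined
undefined = add(2, 3)    # 5 lies outside T, so the sum is undefined (None)
```

Whenever a sum is defined it is unique, since it is inherited from the ambient group.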


In the first part of the present investigation (sections 1 to 5) we develop 


* Received May 2, 1948. 


a general method for attacking such problems. Here we shall be concerned
mainly with the various possibilities for formulating the associative law of
addition, if we assume that the sum of two elements may or may not be
defined, but if defined is unique; and we shall investigate the effect of
imposing these various laws on the problem of imbedding the given system
into an additive manifold (which possesses an addition which may always be
effected in one and only one fashion and which is associative in the
customary sense). It turns out that such an imbedding may sometimes be
effected and sometimes it is impossible; and if it is possible to effect this
imbedding, then we may search for what one may call a “most general”
containing manifold. If the term “most general” is identified with “freeness”
as recently formulated by Bates [1], then imbeddability implies imbeddability
into a most general manifold; but if we identify the term “most general”
with Schreier’s concept of “freeness,” then it is easy to construct sums of
groups which are “Bates free” but not “Schreier free.”

In the second part we apply the results of the first five sections to the
problem of constructing sums of groups. Here we obtain proofs of the
known results (the theorems of O. Schreier [1] and H. Neumann [1]) and
a variety of more general criteria. It seems to the author as if the results
obtained by no means exhaust the possibilities inherent in the methods which
have been developed here.

One word concerning our methods. In most proofs for the existence of
sums of groups one attempts more or less forcefully to represent the elements
of the sum by some sort of a normal form which has the disadvantage that
the formal sum of two normal forms is not a normal form. Artin [2] gave
recently a proof for the existence of the free sum of groups (without
amalgamation) in which the elements in the sum are classes of similar “vectors.”
Since in this case the elements are “almost in normal form” anyway, the
difference between these methods is not too great in this special case. But
in our more general situation it seems to be much more advantageous to
forget about normal forms altogether; and this we do.


1. Summation. The general frame work for our discussion is provided
by the following concept.

DEFINITION 1. An add is a non vacuous system A of elements with one
composition, called addition a + b, meeting the following requirement:

(U) If a and b are elements in A, then there exists at most one element c
in A such that a + b = c.




Thus the sum of the elements a and b in A is either undefined or else 
it is uniquely determined. 

If $a_1, \dots, a_n$ are n (equal or different) elements in the add A, then
we define the summation $\mathfrak{S}(a_1, \dots, a_n)$ as a subset of A in the following
way by complete induction.

DEFINITION 2. (1) If u is an element in A, then $\mathfrak{S}(u)$ consists of u.
(n) Suppose that $1 < n$ and that $u_1, \dots, u_n$ are in A. Then the element
u in A belongs to $\mathfrak{S}(u_1, \dots, u_n)$ if, and only if, there exists an integer i
with $0 < i < n$ and elements u′ and u″ in $\mathfrak{S}(u_1, \dots, u_i)$ and $\mathfrak{S}(u_{i+1}, \dots, u_n)$
respectively such that u = u′ + u″.

Sometimes we shall say $\mathfrak{S}_A$ instead of $\mathfrak{S}$ in order to indicate that the
summation has to be effected within the add A and not within some more
comprehensive add.


We note that $\mathfrak{S}(a)$ is always a one-element set, that $\mathfrak{S}(a, b)$ is either
vacuous or the one-element set consisting of a + b (if this sum is defined);
and that $\mathfrak{S}(a, b, c)$ may consist of 0, 1 or 2 elements. More generally we have:
$\mathfrak{S}(u_1, \dots, u_n)$ is a finite subset of A.
This important fact is easily verified by complete induction.
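The inductive definition of the summation translates directly into a recursive procedure. The sketch below is illustrative only: the function name `summation`, the convention that an undefined sum is `None`, and the small addition table are assumptions chosen to exhibit the three cases (0, 1 or 2 elements) mentioned above.

```python
def summation(add, vector):
    """Return the set S(v_1, ..., v_n) of Definition 2 for a partial addition.

    `add(u, v)` returns the sum, or None when the sum is undefined.
    `vector` must have length at least 1.
    """
    vector = tuple(vector)
    if len(vector) == 1:
        return {vector[0]}
    result = set()
    # Split the vector at every position i with 0 < i < n.
    for i in range(1, len(vector)):
        for left in summation(add, vector[:i]):
            for right in summation(add, vector[i:]):
                total = add(left, right)
                if total is not None:
                    result.add(total)
    return result

# Hypothetical addition table in which (a + b) + c and a + (b + c) both
# exist but differ.
TABLE = {("a", "b"): "p", ("p", "c"): "x", ("b", "c"): "q", ("a", "q"): "y"}

def add(u, v):
    return TABLE.get((u, v))

one = summation(add, ["a"])              # a single element
two = summation(add, ["a", "b", "c"])    # both bracketings succeed
none = summation(add, ["c", "a"])        # c + a is undefined
```

Each recursive call combines finitely many finite sets, which mirrors the induction showing that the summation is always a finite subset of A.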
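LEMMA 1. Suppose that $s_1, \dots, s_n$ are elements in A, $1 < n$.
(a) If a belongs to $\mathfrak{S}(s_1, \dots, s_n)$, then there exists an integer i with the
properties: $0 < i < n$, $s_i + s_{i+1}$ exists in A, and a belongs to
$\mathfrak{S}(s_1, \dots, s_{i-1}, s_i + s_{i+1}, s_{i+2}, \dots, s_n)$.
(b) If $0 < j < n$, and if $s_j + s_{j+1}$ exists in A, then
$\mathfrak{S}(s_1, \dots, s_{j-1}, s_j + s_{j+1}, s_{j+2}, \dots, s_n) \leq \mathfrak{S}(s_1, \dots, s_n)$.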


These two facts are easily verified by complete induction. 

If the subset S of A is not vacuous, then S may be considered as a subadd
of A where we define addition in S in the obvious fashion. It is, however,
quite possible that the sum of the elements s′ and s″ in S exists in A, but not
in S (namely in case s′ + s″ does not belong to S). This leads us to the
following concept.

DEFINITION 3. If S is a subset of A such that s′ + s″ belongs to S
whenever s′ and s″ are in S and s′ + s″ exists in A, then S is closed in A.

Note that this closure property is weaker than the subgroup property
(the negative integers, for example, form a closed subset of the group of
integers).
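Closure in the sense of Definition 3 can be tested mechanically on a finite add. The following sketch uses an assumed finite stand-in for the integers, namely the add A = {-3, ..., 3} with a sum defined only when it stays in A; the names `is_closed` and `add` are illustrative.

```python
def is_closed(S, add):
    """Definition 3: S is closed if every existing sum of elements of S lies in S."""
    S = set(S)
    for x in S:
        for y in S:
            total = add(x, y)
            if total is not None and total not in S:
                return False
    return True

# Hypothetical finite add: the integers from -3 to 3, with x + y defined
# only when the ordinary sum stays inside this range.
A = set(range(-3, 4))

def add(x, y):
    return x + y if (x + y) in A else None

negatives_closed = is_closed({-3, -2, -1}, add)   # sums stay negative
small_positive_closed = is_closed({1, 2}, add)    # 1 + 2 = 3 exists in A, not in S
```

The negative elements form a closed subset without forming a subgroup, while {1, 2} fails to be closed because 1 + 2 exists in A but lies outside the subset.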




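LEMMA 2. If $s_1, \dots, s_n$ are elements in the subadd S of the add A,
then

(a) $\mathfrak{S}_S(s_1, \dots, s_n) \leq \mathfrak{S}_A(s_1, \dots, s_n) \cap S$ and
(b) $\mathfrak{S}_S(s_1, \dots, s_n) = \mathfrak{S}_A(s_1, \dots, s_n) \cap S$ whenever S is closed in A.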


The inequality (a) is an almost immediate consequence of Definition 2; 
and the equation (b) may be verified by an easy induction argument, based 
on Lemma 1. 


2. The derived additive manifold. The following notations will prove
very convenient in the future and will be used throughout. If Q is a set of
elements, and if $v_1, \dots, v_n$ ($0 < n$) are elements in Q, then $v = (v_1, \dots, v_n)$
is a vector of length n over Q. The element $v_i$ in Q is the i-th coordinate
of the vector v. The system of all vectors over Q will be denoted by V(Q).
In V(Q) we define addition by the rule

$(v_1, \dots, v_n) + (w_1, \dots, w_m) = (v_1, \dots, v_n, w_1, \dots, w_m).$

Addition in V(Q) is unique, associative and the cancellation law holds too,
so that V(Q) is a semi-group. (Note that our “vectors” take the place of
what has often been termed “words.”)

If in particular there is given an add A, then we introduce a similarity 
between the vectors over A as follows: 


(S.1) If a, b, c are elements in A such that a + b = c, and if u and v
are vectors over A, then the vectors u + (a, b) + v and u + (c) + v are
directly similar. (Here, as in similar circumstances, the vectors u and v may
be of “length 0.”)

This direct similarity should be interpreted symmetrically in the sense
that direct similarity of the vectors r and s is equivalent to direct similarity
of the vectors s and r. Direct similarity is not transitive; and thus we
complement the rule (S.1) by the following rule.

(S.2) The vectors v and w over A are similar, in symbols: $v \sim w$, if there
exists a finite number of vectors v(1), ..., v(k) over A, $0 < k$, such that
v = v(1), v(i) and v(i + 1) are directly similar for $0 < i < k$, and v(k) = w.

It is clear that similarity of vectors is reflexive, symmetric, transitive.
Thus we may form classes of similar vectors in V(A); and we shall denote
the class of similar vectors which contains the vector $v = (v_1, \dots, v_n)$ by

$\langle v \rangle = \langle (v_1, \dots, v_n) \rangle.$




It is furthermore quite easy to see that sums of similar vectors are
similar; in symbols: $v \sim v'$ and $w \sim w'$ imply $v + w \sim v' + w'$.
Consequently it is possible to define addition of classes of similar vectors by means
of the formula:

(A.1) $\langle v \rangle + \langle w \rangle = \langle v + w \rangle.$
Denoting by D(A), for A any add, the system of classes of similar vectors, 


subject to the definition A.1 of addition, one verifies readily the following 


facts: 


Addition of any two elements in D(A) is well defined and addition in 
D(A) is associative. 

Thus we may call D(A) the derived additive manifold (of the add A) 
in accordance with the following 


DEFINITION 1. An additive manifold is a nonvacuous system M of
elements with one composition, called addition a + b, meeting the following
requirements:

(a) If a and b are elements in M, then there exists one and only one
element c in M such that a + b = c.
(b) a + (b + c) = (a + b) + c for a, b, c in M.


Because of the associative law (b) we may omit parentheses in sums of 
elements in M. It should be noted that we do not require validity of the 


cancellation law or commutative law. 
To enunciate the fundamental properties of the derived manifold we 


need the concept of homomorphism. 


DEFINITION 2. If A and B are adds, if h is a single valued A to B
mapping, and if

a + b = c for a, b, c in A implies ah + bh = ch,

then h is a homomorphism of A into B.


If, for instance, the sum of elements in A never exists, then every single 
valued mapping of A into some add is a homomorphism. 


THEOREM 1. If A is an add, then the mapping

(N) $a^* = \langle (a) \rangle$ for a in A

has the following properties:




(a) Mapping x in A upon x* in D(A) is a homomorphism of the add A
into the derived additive manifold D(A).

(b) Every element in D(A) is the sum of elements in the subset A* (of all
the elements x* for x in A).

(c) If h is a homomorphism of A into the additive manifold M, then there
exists one and only one homomorphism f of D(A) into M such that x*f = xh
for every x in A.

Proof. It is clear that a single valued mapping of A into D(A) is effected
by mapping x in A onto $x^* = \langle (x) \rangle$ in D(A). If a, b, c are elements in A
such that a + b = c, then it follows from the preceding definitions that

$a^* + b^* = \langle (a) \rangle + \langle (b) \rangle = \langle (a) + (b) \rangle = \langle (a, b) \rangle = \langle (c) \rangle = c^*.$

This shows the validity of (a).

If y is an element in D(A), then there exist elements $a_1, \dots, a_n$ in A
such that

$y = \langle (a_1, \dots, a_n) \rangle = \langle (a_1) \rangle + \dots + \langle (a_n) \rangle = a_1^* + \dots + a_n^*;$

and this proves the validity of (b).

If h is a homomorphism of A into the additive manifold M, and if f is a
homomorphism of D(A) into M such that x*f = xh for x in A, then we have,
for $a_1, \dots, a_n$ in A, as before

$\langle (a_1, \dots, a_n) \rangle f = (a_1^* + \dots + a_n^*) f = a_1^* f + \dots + a_n^* f = a_1 h + \dots + a_n h.$

Thus f is uniquely determined by h, if it exists at all. To prove the existence
of at least one homomorphism f, meeting our requirements, we proceed as
follows:

If $a_1, \dots, a_n$ are elements in A, then $a_1 h + \dots + a_n h$ is a well
determined element in the additive manifold M. Thus we may let

$(a_1, \dots, a_n) h = a_1 h + \dots + a_n h.$

If a, b, c are elements in A such that a + b = c, then ah + bh = ch since
h is a homomorphism. If u and v are vectors over A, then it follows now
from the validity of the associative law in M that

$[u + (a, b) + v]h = uh + (a, b)h + vh = uh + ah + bh + vh = uh + ch + vh = [u + (c) + v]h.$



Hence h maps directly similar vectors upon the same element. Consequently
any two similar vectors are mapped by h upon the same element in M. Hence
a single valued mapping f of D(A) into M is defined by the rule:

$\langle v \rangle f = vh$ for v in V(A).

From $\langle v \rangle f + \langle w \rangle f = vh + wh = (v + w)h = \langle v + w \rangle f$, by the associative
law in M, we deduce finally that f is a homomorphism of D(A) into M.
If x is an element in A, then we have $x^* f = \langle (x) \rangle f = (x)h = xh$; and this
completes the proof.

The homomorphism of A into D(A) which is defined by Theorem 1,
(N) will be referred to as the natural homomorphism of the add A into its
derived additive manifold D(A).


In order to show that the derived manifold and the natural homo- 
morphism are essentially uniquely determined by the properties (b) and 
(c) of Theorem 1, we need the concept of isomorphism. 


DEFINITION 3. An isomorphism of the add A upon the add B is a
correspondence α meeting the following requirements:

(a) α is a one to one mapping of A upon B = Aα;

(b) if a, b, c are elements in A, then the validity of a + b = c is necessary
and sufficient for the validity of aα + bα = cα.

Thus if α is an isomorphism of A upon B, then there exists the inverse
correspondence $\alpha^{-1}$ and $\alpha^{-1}$ is an isomorphism of B upon A. It should in
particular be noted that a homomorphism of A into B which is at the same
time a one-to-one mapping of A upon B need not be an isomorphism. If,
for instance, the adds A and B consist of the same elements, though using
different definitions of additions, then it is quite possible that the identity
mapping is a homomorphism of A upon B without being an isomorphism
of A upon B.


THEOREM 2. Suppose that the homomorphism r of the add A into the
additive manifold M has the following properties.

(a) Every element in M is the sum of elements in Ar.

(b) If h is a homomorphism of A into some additive manifold N, then
there exists a homomorphism f of M into N such that h = rf.

Then there exists an isomorphism v of M upon D(A) such that x* = xrv
for x in A.




Proof. We apply first condition (b) upon the natural homomorphism
of A into D(A). Hence there exists a homomorphism v of M into D(A)
such that x* = xrv for x in A. Next we apply Theorem 1, (c) upon the
homomorphism r of A into M. Thus there exists a homomorphism f of D(A)
into M such that x*f = xr for x in A. We have consequently

x* = xrv = x*fv and xr = x*f = xrvf for every x in A.

If y is an element in D(A), then it follows from Theorem 1, (b) that y is
the sum of elements in A*. But every element x* in A* is left invariant
by fv; and thus it follows that y = yfv for y in D(A). Likewise we deduce
from our hypothesis (a) that z = zvf for every z in M. Thus we have
shown that fv = 1 and vf = 1. But f and v are homomorphisms and
consequently they are reciprocal isomorphisms between D(A) and M. Hence v
is the desired isomorphism of M upon D(A).


3. The associative laws. We are now going to enunciate a variety of 
properties of adds which would be equivalent to the customary associative 
law, if the sum of any two elements in the add would always exist. But as 
properties of adds they can be shown not to be equivalent. 

If v is a vector over the add A, then $v = (v_1, \dots, v_n)$; and thus the
addition $\mathfrak{S}v = \mathfrak{S}(v_1, \dots, v_n)$ is well defined (see section 1).

THE FIRST OR WEAK ASSOCIATIVE LAW is satisfied by the add A, if $\mathfrak{S}v$
contains, for every vector v over A, at most one element.

We note that $\mathfrak{S}v$ contains one and only one element, if v has length 1;
and that $\mathfrak{S}v$ contains at most one element, if v is of length 2. For vectors
of length 3 the weak associative law is rather similar in effect to the customary
associative law.
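For a finite add the weak associative law can be checked on all vectors up to a chosen length. The sketch below is a bounded check only, not a proof for all lengths; the function names, the `None` convention for undefined sums, and both example adds are assumptions made for illustration.

```python
from itertools import product

def summation(add, vector):
    """Set of all values obtained by bracketing the vector in every possible way."""
    if len(vector) == 1:
        return {vector[0]}
    result = set()
    for i in range(1, len(vector)):
        for left in summation(add, vector[:i]):
            for right in summation(add, vector[i:]):
                total = add(left, right)
                if total is not None:
                    result.add(total)
    return result

def weak_law_holds_up_to(elements, add, max_length):
    """True if every vector of length <= max_length has at most one sum."""
    for n in range(1, max_length + 1):
        for vector in product(elements, repeat=n):
            if len(summation(add, vector)) > 1:
                return False
    return True

# Hypothetical add 1: T = {1, 2, 3} with addition inherited from the integers.
def integer_add(x, y):
    return x + y if (x + y) in {1, 2, 3} else None

# Hypothetical add 2: a table where (a + b) + c and a + (b + c) differ.
TABLE = {("a", "b"): "p", ("p", "c"): "x", ("b", "c"): "q", ("a", "q"): "y"}

def table_add(u, v):
    return TABLE.get((u, v))

integers_ok = weak_law_holds_up_to([1, 2, 3], integer_add, 4)
table_ok = weak_law_holds_up_to(["a", "b", "c", "p", "q", "x", "y"], table_add, 3)
```

The first add passes because every defined bracketing yields the ordinary integer sum; the second fails already at length 3, where the vector (a, b, c) has two distinct sums.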


THE SECOND ASSOCIATIVE LAW is satisfied by the add A, if the natural
homomorphism of A into its derived additive manifold D(A) is a one-to-one
correspondence.

The following fact will be useful.

LEMMA 1. If v is a vector over the add A, and if the element u in A
belongs to $\mathfrak{S}v$, then $(u) \sim v$.

This is easily verified by complete induction with respect to the length
of v. (Use 1, Lemma 1, (a).)


The first associative law is a consequence of the second associative law. 




Proof. If the elements a and b belong to $\mathfrak{S}v$, then it follows from
Lemma 1 that $(a) \sim v \sim (b)$. Hence a* = b*; and a = b is a consequence
of the fact that the natural homomorphism is one to one. Hence $\mathfrak{S}v$ contains
at most one element.

In general, the second associative law is not a consequence of the first 
one. However, we shall show (in section 4) that they are equivalent in the 
cases of greatest interest. 


THE THIRD OR INTERMEDIATE ASSOCIATIVE LAW is satisfied by the add 
A, if the natural homomorphism of A into its derived additive manifold is 


an isomorphism. 


It is obvious that the second associative law is a consequence of the 
third one: and their non-equivalence may be verified easily [see 6, Example 2]. 
We may decompose the third associative law into two properties as follows: 


(III.1) $(a) \sim (b)$ if, and only if, a = b.
(III.2) $(a, b) \sim (c)$ if, and only if, a + b = c.

(III.1) is equivalent to the second associative law, since $(a) \sim (b)$ if,
and only if, a* = b*. (III.1) and (III.2) together are equivalent with the
third associative law, since $(a, b) \sim (c)$ if, and only if, a* + b* = c* (see
also 2, Definition 3).

THE FOURTH OR STRONG ASSOCIATIVE LAW is satisfied by the add A, if
$v \sim w$ implies always $\mathfrak{S}v = \mathfrak{S}w$.

The third associative law is a consequence of the fourth associative law.

Proof. If $(a) \sim (b)$, then we infer $\mathfrak{S}(a) = \mathfrak{S}(b)$ from the strong
associative law. This implies a = b, since $\mathfrak{S}(x)$ consists of the element x
alone. Hence (III.1) is true. If $(a, b) \sim (c)$, then we infer $\mathfrak{S}(a, b) = \mathfrak{S}(c)$
from the strong associative law. Since $\mathfrak{S}(c)$ consists of c only, it follows
that c belongs to $\mathfrak{S}(a, b)$; and this implies a + b = c. Hence (III.2) is
true too.

The converse of the preceding statement is not true [6, Example 3]; 
and it appears that there exists a considerable gap between the third and the 
fourth associative law. 

In what follows we shall refer to the various associative laws either 
by their names or by roman numerals. We remark that I seems to be the 
most natural generalization of the associative law as it is 
usually formulated. The importance of III is fairly apparent; and, as a 




FREE SUMS OF GROUPS AND THEIR GENERALIZATIONS. 715 


matter of fact, IV will be by far the most important of all. The importance 
of III and IV will be elucidated somewhat in section 5. Some indication of 
the importance of Properties (II) and (III) is contained in the following 
useful proposition which is a generalization of a result due to H. Neumann 
([1], p. 598, Theorem). 


LEMMA 2. The natural homomorphism of the add A into D(A) is an 
isomorphism (is one to one) if there exists an isomorphism (a one-to-one 
homomorphism) of A into some additive manifold. 


Proof. If 𝔥 is a homomorphism of A into the additive manifold M, 
then we deduce from 2, Theorem 1, (c) the existence of a homomorphism 𝔣 
of D(A) into M such that x*𝔣 = x𝔥 for x in A. Now it is clear that the 
natural homomorphism is one to one, if 𝔥 is one to one; and the natural 
homomorphism is an isomorphism whenever 𝔥 is an isomorphism. 


COROLLARY. The add A may be imbedded into an additive manifold if, 
and only if, the third associative law is satisfied by A. 


This is a fairly immediate consequence of Lemma 2 and the existence 
of the derived manifold D(A). 


Proposition 1. (a) If the first (or second or third) associative law is 
satisfied by the add A, then it is also satisfied by the subadds of A. 
(b) If the strong associative law is satisfied by the add A, then it is also 
satisfied by the closed subadds of A. 


Proof. If the weak associative law is satisfied by A, if S is a subadd 
of A, and if v is a vector over S, then 𝔖_S v ⊆ 𝔖_A v [1, Lemma 2, (a)]. 
Since the latter summation contains at most one element, so does the former. 
Hence the weak associative law is satisfied by S. 

If the natural homomorphism of A into D(A) is one to one (is an 
isomorphism), then it induces a one-to-one homomorphism (an isomorphism) 
of the subadd S of A into the additive manifold D(A). It follows from 
Lemma 2 that the natural homomorphism of S is one to one (is an 
isomorphism); and this completes the proof of (a). 

Assume finally the validity of the strong associative law in A; and 
suppose that the subadd S of A is closed in A. If v and w are similar 
vectors over S, then it follows from V(S) ⊆ V(A) that v and w are similar 
vectors over A. Hence 𝔖_A v = 𝔖_A w, since the strong associative law holds 
in A. Since S is closed in A, we deduce 

𝔖_S v = 𝔖_A v ∩ S = 𝔖_S w 




from 1, Lemma 2, (b); and this shows the validity of the strong associative 
law in S. 

Remark 1. It is not difficult to construct subadds of groups which do 
not satisfy the strong associative law (see, for instance, 6, Example 3). This 
shows that Proposition 1, (b) would fail to be true, if we omitted the 
requirement that S be closed in A. 


Proposition 2. The associative law (X), for X = I, II, III, IV, is 
satisfied by the add A, if every finite subset of A is contained in a subadd 
of A with Property (X). 


Proof. Assume that every finite subset of A is contained in a subadd 
of A which satisfies the strong associative law. Suppose that v = (v_1, ..., v_n) 
and that the vector w of length n - 1 is directly similar to v. Form a finite 
subset F of A as follows: F contains every v_i; it contains v_i + v_{i+1}, if it 
exists in A; it contains (v_i + v_{i+1}) + v_{i+2} and v_i + (v_{i+1} + v_{i+2}), 
if these sums exist in A; and so on. It is clear that F is finite, that v and w are 
vectors over F, and that 𝔖_A v = 𝔖_F v and 𝔖_A w = 𝔖_F w. But F is, by 
hypothesis, contained in a subadd S of A in which the strong associative law is 
satisfied. Now one sees that 𝔖_A v = 𝔖_F v = 𝔖_S v = 𝔖_S w = 𝔖_F w = 𝔖_A w. Hence 
directly similar vectors in A have the same summation in A; and 
consequently any two similar vectors in A have the same summation in A. Thus 
the strong associative law is valid in A. The other contentions of our 
proposition are verified in a similar fashion. 


DEFINITION. The vector v = (v_1, ..., v_n) over A is contractable, if 
1 < n and if there exists an integer i such that 0 < i < n and such that the 
sum v_i + v_{i+1} exists in A. 

Using the terminology of section 2 we may also say that the vector v 
of length n is contractable, if 1 < n and if there exists a vector w of length 
n - 1 which is directly similar to v. 


THEOREM 1. The following properties of the add A imply each other. 

(i) If a is in A and v in V(A), and if (a) ~ v, then a belongs to 𝔖v. 

(ii) If a, a_1, ..., a_n are elements in A such that (a) ~ (a_1, ..., a_n), then 
n = 1 and a = a_1 or else 1 < n and the vector (a_1, ..., a_n) is contractable. 

(iii) If v and w are similar vectors over A, and if 𝔖v is not vacuous, then 
𝔖w consists of one and only one element. 




(iv) The weak associative law holds in A; and if v and w are directly 
similar vectors of length n - 1 and n respectively such that 𝔖w is not 
vacuous, then 𝔖v is not vacuous. 

(v) The strong associative law holds in A. 


Proof. Assume the validity of (i); and suppose that a, a_1, ..., a_n are 
elements in A such that (a) ~ (a_1, ..., a_n) = v. It follows from (i) that 
a belongs to 𝔖v. If n = 1, then 𝔖v consists of a_1 alone so that a = a_1; and 
if 1 < n, then we infer the contractability of v from 1, Lemma 1, (a). Thus 
(ii) is a consequence of (i). 

Assume next the validity of (ii). Suppose that r is a vector over A 
and that a and b are elements in 𝔖r. It follows from Lemma 1 that 
(a) ~ r ~ (b); and hence it follows from (ii) that a = b. This proves the 
validity of the weak associative law in A. Consider next a vector v over A 
such that 𝔖v is not vacuous. If there existed among the vectors similar to 
v one with vacuous summation, then there would exist a vector w of minimal 
length m such that v ~ w and 𝔖w is vacuous. Clearly 1 < m, since 𝔖(x), 
for x in A, consists of x. There exists an element u in 𝔖v; and it follows 
from Lemma 1 that (u) ~ v. Consequently (u) ~ w; and we infer from 
(ii) the contractability of w. Hence there exists a vector s of length m - 1 
which is directly similar to w; and it follows from 1, Lemma 1, (b) that 
𝔖s ⊆ 𝔖w. Since v ~ w ~ s, and since the length of s is less than the 
(minimal) length of w, it follows that 𝔖s is not vacuous. Hence 𝔖w cannot 
be vacuous either, a contradiction which proves that 𝔖t is not vacuous, 
whenever t ~ v (and 𝔖v is not vacuous). Since we have already verified 
the validity of the weak associative law, we have shown that (iii) is a 
consequence of (ii). 

(iv) is a special case of (iii) (and hence a consequence of (iii)). 

Assume now the validity of (iv). Suppose that v and w are directly 
similar vectors of length n - 1 and n respectively. Then it follows from 1, 
Lemma 1, (b) that 𝔖v ⊆ 𝔖w. It follows from (iv) that either 𝔖v and 
𝔖w are both vacuous or they both contain exactly one element. In either 
case we may infer 𝔖v = 𝔖w from 𝔖v ⊆ 𝔖w. Thus the summations of 
directly similar vectors are equal; and it follows (by complete induction) 
that the summations of similar vectors are equal. Thus the strong 
associative law holds in A, proving that (v) is a consequence of (iv). 

Assume finally the validity of the strong associative law. If a is in A 
and v in V(A), and if (a) ~ v, then 𝔖(a) = 𝔖v. But 𝔖(a) consists of a 
alone, showing that a belongs to 𝔖v. Thus (i) is a consequence of (v); and 
this completes the proof. 




Remark 2. Condition (ii) may be restated in the following very useful 
form which involves the natural homomorphism of A into D(A). 


(ii') The natural homomorphism of A into D(A) is one to one; and if 
a_1, ..., a_n are elements in A such that 1 < n and such that the sums a_i + a_{i+1}, 
for 0 < i < n, do not exist in A, then a_1* + ... + a_n* does not belong to A*. 

The equivalence of the properties (ii) and (ii') is easily deduced from 
the fact that 

(a) ~ (a_1, ..., a_n) and a* = a_1* + ... + a_n* 

are equivalent properties of the n + 1 elements a, a_1, ..., a_n in A. 


4. Self-reflexive adds. This important class of adds is characterized 
by the validity of the following postulates. 

(SR.1) There exists an element 0 in A such that x = x + 0 = 0 + x for 
every x in A. 

It is clear that the null-element is uniquely determined, if it exists. 
It may happen that a subadd of an add with 0 possesses a null-element 
other than 0. We shall therefore usually restrict our attention to those subadds 
of A which contain the element 0. 

(SR.2) To every element x in A there exists an element y in A such that 
x + y = 0. 

(SR.3) x + y = 0 and x + z = 0 imply y = z. 

If (SR.2) and (SR.3) are satisfied, then we may denote the uniquely 
determined solution of the equation a + x = 0 by x = -a. Instead of 
(-a) + b and c + (-a) we shall write -a + b and c - a. 

(SR.4) x + y = 0 if, and only if, y + x = 0. 

Together with the preceding postulates this implies -(-x) = x. 

If these four postulates are satisfied by the add A, then we shall term 
the add A self-reflexive. Sometimes it will be convenient to be able to use 
the following rule which will be satisfied in all important situations. 


(SR.5) If a + b = 0, and if the sum b + c exists in A, then a + (b + c) 
= c; if the sum r + s exists in A, and if s + t = 0, then (r + s) + t = r. 

An add A which satisfies all the rules (SR.1) to (SR.5) shall be termed 
self-reflexive in the strict sense. 
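
For a finite add the postulates (SR.1) to (SR.5) can be tested mechanically. The following Python sketch is only an illustration and not part of the original argument: it assumes that an add is given as a list of elements together with a dictionary `add` sending a pair (x, y) to x + y whenever that sum exists, and the names `find_zero`, `is_self_reflexive`, `GROUP` and `PARTIAL` are hypothetical.

```python
from itertools import product


def find_zero(elems, add):
    """Return an element 0 with x + 0 = 0 + x = x for every x (SR.1), else None."""
    for z in elems:
        if all(add.get((x, z)) == x and add.get((z, x)) == x for x in elems):
            return z
    return None


def is_self_reflexive(elems, add, strict=False):
    """Check (SR.1)-(SR.4); with strict=True also check both halves of (SR.5)."""
    zero = find_zero(elems, add)
    if zero is None:
        return False  # (SR.1) fails
    for x in elems:
        # (SR.2) and (SR.3): exactly one y with x + y = 0
        solutions = [y for y in elems if add.get((x, y)) == zero]
        if len(solutions) != 1:
            return False
    for x, y in product(elems, repeat=2):
        # (SR.4): x + y = 0 if, and only if, y + x = 0
        if (add.get((x, y)) == zero) != (add.get((y, x)) == zero):
            return False
    if not strict:
        return True
    for a, b in product(elems, repeat=2):
        if add.get((a, b)) != zero:
            continue
        for c in elems:
            # first half: a + b = 0 and b + c exists imply a + (b + c) = c
            if (b, c) in add and add.get((a, add[(b, c)])) != c:
                return False
            # second half with (r, s, t) = (c, a, b):
            # c + a exists and a + b = 0 imply (c + a) + b = c
            if (c, a) in add and add.get((add[(c, a)], b)) != c:
                return False
    return True


# Illustrative data (assumption): the cyclic group of order 3, and the same
# table with the single sum 1 + 1 declared undefined.
Z3 = [0, 1, 2]
GROUP = {(x, y): (x + y) % 3 for x, y in product(Z3, repeat=2)}
PARTIAL = {pair: s for pair, s in GROUP.items() if pair != (1, 1)}
```

In this toy data the full group table passes the strict test, whereas the table with 1 + 1 removed still satisfies (SR.1) to (SR.4) but violates the first half of (SR.5), since 1 + 2 = 0 and 2 + 2 exists while 1 + (2 + 2) does not.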




Proposition 1. If the add A has properties (SR.1) and (SR.2), then 
the derived additive manifold is a group. 


Proof. Addition in D(A) is always defined and associative. It is clear 
that <(0)> is the null-element of the additive manifold D(A). If d is an 
element in D(A), then d = <(d_1, ..., d_n)> with d_i in A. Hence there exist 
elements d'_i in A such that d_i + d'_i = 0 for every i. Let d' = <(d'_n, ..., d'_1)>. 
Then one verifies readily that 

d + d' = <(d_1, ..., d_n, d'_n, ..., d'_1)> = <(0)>. 

It is well known, and easily verified, that under these circumstances the 
additive manifold D(A) is a group. 

LEMMA 1. Assume the validity of (SR.1) and (SR.2) in the add A. 
(a) (SR.3) is a consequence of the second associative law. 
(b) (SR.4) and (SR.5) are consequences of the third associative law. 


Proof. If x, y, z are elements in A such that x + y = x + z = 0, then 
x* + y* = x* + z* = 0*. But D(A) is a group [Proposition 1]. Hence 
y* = z* and y = z is a consequence of (II). 

If x + y = 0, then x* + y* = 0*. Since D(A) is a group, we have 
y* + x* = 0*; and y + x = 0 is a consequence of (III). Likewise we infer 
from x + y = 0 and the existence of the sum y + z that 

x* + (y + z)* = x* + y* + z* = z* 

and that therefore x + (y + z) = z [here we applied (III) twice]; and the 
other half of (SR.5) is verified likewise. 

For a convenient enunciation of the next proposition we define 
inductively the concept of i-fold contraction of a vector. Every vector is its own 
0-fold contraction. If w is an i-fold contraction of v, if w = r + (a,b) + s 
[where the vectors r and s may have length 0], and if the sum a + b exists 
in A, then r + (a + b) + s is an (i + 1)-fold contraction of the vector v. 
Instead of "i-fold contraction" we shall often say only "contraction" [see 
3, Definition]. 

One deduces easily from 2, (S.1) and 2, (S.2) that vectors are similar 
to all their contractions. The following property is often useful. 

LEMMA 2. The element a in A belongs to 𝔖v if, and only if, the vector 
(a) is a contraction of the vector v. 


This is easily verified by complete induction. 
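
Lemma 2 gives a direct way to compute summations in a finite add: collect all contractions of v and keep those of length 1. The following Python sketch illustrates this under stated assumptions; it is not taken from the paper. Vectors are tuples, the partial addition is a dictionary `add` sending (a, b) to a + b whenever the sum exists, and the names `one_fold_contractions`, `contractions`, `summation`, `is_contractable`, `Z4` and `TOY` are hypothetical.

```python
from itertools import product


def one_fold_contractions(v, add):
    """All vectors r + (a + b) + s obtained from v = r + (a, b) + s with a + b defined."""
    return {
        v[:i] + (add[(v[i], v[i + 1])],) + v[i + 2:]
        for i in range(len(v) - 1)
        if (v[i], v[i + 1]) in add
    }


def contractions(v, add):
    """All i-fold contractions of v, for every i >= 0 (v itself is the 0-fold one)."""
    seen = {tuple(v)}
    frontier = [tuple(v)]
    while frontier:
        w = frontier.pop()
        for c in one_fold_contractions(w, add):
            if c not in seen:
                seen.add(c)
                frontier.append(c)
    return seen


def summation(v, add):
    """The summation of v, read off via Lemma 2: all a such that (a) is a contraction of v."""
    return {w[0] for w in contractions(v, add) if len(w) == 1}


def is_contractable(v, add):
    """A vector of length > 1 is contractable if some adjacent sum exists."""
    return bool(one_fold_contractions(tuple(v), add))


# Illustrative data (assumptions): the cyclic group of order 4, and a toy add
# in which the two bracketings of (a, b, c) give different results x and y.
Z4 = {(x, y): (x + y) % 4 for x, y in product(range(4), repeat=2)}
TOY = {("a", "b"): "p", ("b", "c"): "q", ("p", "c"): "x", ("a", "q"): "y"}
```

In the group table every summation consists of exactly one element, while in the toy add the summation of (a, b, c) contains both x and y, so that the weak associative law fails there; the vector (a, c) is not contractable and its summation is vacuous.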




LEMMA 3. If the vector v(1) + ... + v(n) is an i-fold contraction of 
the vector w, then there exist vectors w(j) with the following properties: 
v(j) is an i(j)-fold contraction of w(j), w = w(1) + ... + w(n) and 
i = i(1) + ... + i(n). 

The proof is effected by complete induction with respect to i after direct 
verification of the cases i = 0 and i = 1. 


THEOREM 1. If the add A is self-reflexive in the strict sense, and if the 
vectors v and w over A possess a common contraction, then there exists a 
vector z over A such that both v and w are contractions of z. 


We precede the proof of Theorem 1 by the proof of the following special 
case. 


(1.i) If u + (a + b) + v is an i-fold contraction of the vector w, then 
there exists a vector z such that both u + (a,b) + v and w are contractions 
of z. 

Proof by complete induction with respect to i. 

If i = 0, u + (a + b) + v = w is a 1-fold contraction of z = u + (a,b) 
+ v. This shows the validity of (1.0). Assume now that 0 < i and that 
(1.j) has been verified for 0 ≤ j < i. Suppose that u + (a + b) + v is an 
i-fold contraction of the vector w. It follows from the definition of i-fold 
contraction that there exists a vector t such that u + (a + b) + v is a 1-fold 
contraction of t and t is an (i - 1)-fold contraction of w. We distinguish 
two cases. 

Case 1. t = u + (c,d) + v with a + b = c + d = e. 

We infer from Lemma 3 the existence of vectors p, q, r with the 
following properties: u is a contraction of p, (c,d) is a contraction of q, 
v is a contraction of r; and w = p + q + r. Since A is self-reflexive in the 
strict sense, there exists a well determined element -a in A which satisfies 
a - a = 0 and -a + e = -a + (a + b) = b. We let z = p + (a,-a) 
+ q + r. Then w is a 2-fold contraction of z. Since u, (c,d), v are 
contractions of p, q and r respectively, it follows that u + (a,-a) + (c,d) + v 
is a contraction of z. Since c + d = e and -a + e = b, it follows that 
u + (a,b) + v is a 2-fold contraction of the latter vector. Hence both w 
and u + (a,b) + v are contractions of z. 

Case 2. t has not the form discussed in Case 1. 

Since u + (a + b) + v is a 1-fold contraction of t, we may assume 
without loss in generality that 




t = p + (c,d) + q + (a + b) + v with u = p + (c + d) + q. 

Since t is an (i - 1)-fold contraction of w, we infer from (1.i-1) the 
existence of a vector z over A such that both w and p + (c,d) + q + (a,b) 
+ v are contractions of z. Since u + (a,b) + v is a 1-fold contraction of 
p + (c,d) + q + (a,b) + v, it is likewise a contraction of z. 

This completes the inductive proof of (1.i). 


For convenience of inductive argument we restate Theorem 1 as follows: 


(i.j) If the vector c over A is an i-fold contraction of v and a j-fold 
contraction of w, then there exists a vector z over A such that both v and w are 
contractions of z. 


Proof by complete induction with respect to i. 


If i = 0, then v = c is a j-fold contraction of w = z. This shows the 
validity of (0.j). If i = 1, then v = r + (a,b) + s and c = r + (a + b) 
+ s and c is a j-fold contraction of w. Thus we obtain for i = 1 just the 
assertion (1.j) which we have verified before. 

Assume now that 1 < i and that (k.j) has been verified for 0 ≤ k < i 
(and every j). If c is an i-fold contraction of the vector v and a j-fold 
contraction of the vector w, then there exists a vector p such that c is a 
1-fold contraction of p and p is an (i - 1)-fold contraction of v. Since c 
is a 1-fold contraction of p and a j-fold contraction of w, we infer from (1.j) 
the existence of a vector q over A such that both w and p are contractions 
of q. Since p is an (i - 1)-fold contraction of v and an m-fold contraction 
of q, we infer from (i-1.m), (the inductive hypothesis), the existence of 
a vector z over A such that both v and q are contractions of z. Since w is a 
contraction of the contraction q of z, it follows that w is a contraction of z. 
Thus v and w are contractions of z, as we intended to prove. This completes 
the proof of (i.j) and of Theorem 1. 


Remark 1. It is worth noting that only the properties (SR. 1), (SR. 2), 
(SR. 4) and the first half of (SR. 5) have been used in the proof of Theorem 1. 


COROLLARY 1. The vectors v and w over the strictly self-reflexive add A 
are similar if, and only if, there exists a vector z over A such that both v 
and w are contractions of z. 

Proof. If v and w are contractions of z, then v ~ z ~ w, as has been 
pointed out before; and v ~ w is a consequence of the transitivity of similarity. 
If conversely v and w are similar vectors, then one deduces readily from the 



definition of similarity the existence of vectors v(i) with the following 
properties: 

v = v(0), v(i) is a contraction of v(i + 1) or v(i + 1) is a contraction 
of v(i), v(n) = w. 

We are going to construct by complete induction with respect to i vectors 
z(i) such that v and v(i) are both contractions of z(i). For i = 0 we may 
clearly choose z(0) = v. Thus suppose that we have already constructed a 
vector z(i) of which v and v(i) are contractions. If i = n, then we have 
obtained the desired vector z. If i < n, then we distinguish two cases. 

Case 1. v(i + 1) is a contraction of v(i). 

Then v(i + 1) is also a contraction of z(i); and so we may let 
z(i + 1) = z(i). 

Case 2. v(i + 1) is not a contraction of v(i). 

Then v(i) is a contraction of both z(i) and v(i + 1); and it follows 
from Theorem 1 that there exists a vector z(i + 1) over A such that both z(i) 
and v(i + 1) are contractions of z(i + 1). Since v is a contraction of z(i), 
v is likewise a contraction of z(i + 1). 

This completes the induction. Hence there exists a vector z = z(n) of 
which both v and v(n) = w are contractions. This completes the proof. 


COROLLARY 2. The first and second associative law are equivalent 
properties of self-reflexive adds in the strict sense. 

Proof. It has been shown in section 3 that the first associative law is a 
consequence of the second one. Assume now that A is a self-reflexive add 
in the strict sense and that the first associative law is satisfied by A. 
Consider elements a, b in A such that a* = b*. Then (a) ~ (b); and it follows 
from Corollary 1 that there exists a vector v over A of which the vectors (a) 
and (b) are contractions. We infer from Lemma 2 that the elements a and b 
belong both to 𝔖v; and a = b is now a consequence of the first associative law. 
Thus a = b is a consequence of a* = b*; and this shows the validity of the 
second associative law. 


Remark 2. It may be of interest to obtain results similar to the 
preceding ones without the use of (SR.5). In this respect we state without 
proof the following fact: 

The second associative law is satisfied by the self-reflexive add A (in the 
weak sense) if, and only if, the first associative law is satisfied by A and 
(0) ~ (a,b) implies 0 = a + b. 




THEOREM 2. If A is a self-reflexive add (in the weak sense), then the 
following conditions are necessary and sufficient for the validity of the strong 
associative law in A. 

(a) A is a self-reflexive add in the strict sense. 
(b) The weak associative law is satisfied by A. 
(c) If (0) ~ v, and if the length of v is greater than 2, then v is 
contractable. 


Proof. Assume first the validity of the strong associative law. Then 
we have shown in section 3 that all four associative laws hold in A. Thus 
(b) is true; and the validity of (a) is a consequence of Lemma 1. The 
validity of (c) may be inferred from 3, Theorem 1. 

Assume conversely the validity of the conditions (a), (b), (c). Then 
it follows from (a), (b) and Corollary 2 that the second associative law holds 
in A. If a and b are elements in A such that (a) ~ (b), then a* = b*; 
and a = b is a consequence of the second associative law. Next we prove by 
complete induction with respect to n (for 1 < n) the validity of the following 
fact. 

(C.n) If the vector v of length n is similar to a vector of length 1, then v 
is contractable. 

To prove (C.2) consider elements a, b, c in A such that (a) ~ (b,c). 
Hence (0) ~ (-a,a) ~ (-a,b,c); and it follows from (c) that at least 
one of the sums -a + b and b + c exists in A. If -a + b exists 
in A, then (-c) ~ (-a,b,c,-c) ~ (-a + b,0) ~ (-a + b); and 
-c = -a + b is a consequence of the second associative law. Applying 
(SR.5) twice we find that a - c = a + (-a + b) = b and b + c = (a - c) 
+ c = a. If -a + b does not exist in A, then b + c exists in A. Hence 
(a) ~ (a,-a,b,c) ~ (0,b + c) ~ (b + c); and this implies, as has been 
pointed out before, a = b + c. Thus we have in either case a = b + c. 
Consequently (C.2) (and the third associative law) are true. 

Assume now that 2 < n and that the validity of (C.j) has been verified 
for 2 ≤ j < n. Assume that a, b_1, ..., b_n are elements in A such that 
(a) ~ (b_1, ..., b_n). Then (0) ~ (-a,a) ~ (-a, b_1, ..., b_n); and we 
infer from Condition (c) the existence of at least one of the sums -a + b_1, 
b_1 + b_2, ..., b_{n-1} + b_n in A. If -a + b_1 exists in A, then 

(-(-a + b_1)) ~ (-(-a + b_1), -a + b_1, b_2, ..., b_n) ~ (b_2, ..., b_n); 

and it follows from (C.n-1) that (b_2, ..., b_n) is contractable. Thus the 
vector (b_1, ..., b_n) is in all cases contractable. This proves (C.n) and 
completes the induction. 

Now we have verified the validity of Condition (ii) of 3, Theorem 1; 
and this shows the validity of the strong associative law in A, as we claimed. 


COROLLARY 3. The strong associative law is satisfied by the strictly 
self-reflexive add if, and only if, (0) is a contraction of the vector v whenever 
(0) ~ v. 

Proof. The necessity of our condition is a consequence of the fact that 
𝔖(0) consists of 0 alone; and that (0) is a contraction of v whenever 0 
belongs to 𝔖v. To prove the sufficiency of our condition we have to verify 
only the validity of the second associative law (by Theorem 2); and this 
is easily done, since a* = b* implies (a) ~ (b), and since this implies 
(0) ~ (b,-a) so that (0) is a contraction of (b,-a) or 0 = b - a or 
b = a. 


5. Freedom and independence. In this section we shall establish the 
relations between the methods and results of the first four sections and the 
problems of group construction, in particular that of generation of groups 
by subadds. Here we say that the additive manifold M is generated by its 
subadd S, in symbols: M = {S}, if every element in M is the sum of some 
elements in S. If S happens to meet requirements (SR.1) and (SR.2) of 
section 4, then {S} is easily seen to be a group. 


DEFINITION 1. The subadd S of the additive manifold M is a free add 
of generators of M [or M is freely generated by S], if 

(a) M = {S}, and if 

(b) there exists to every homomorphism 𝔰 of the add S into some additive 
manifold N a homomorphism 𝔱 of M into N such that x𝔰 = x𝔱 for every x in S. 

It is readily seen that the homomorphism 𝔱 is uniquely determined by 𝔰 
and maps {S} upon {S𝔰}. (See Bates [1] for a discussion of free generation.) 


THEOREM 1. Suppose that A is an add. 


(1) If the additive manifold M is freely generated by A, then there exists 
an isomorphism of M upon D(A) which induces the natural homomorphism 
in A. 

(2) If A is a free add of generators of the additive manifolds M' and M'', 




then there exists an isomorphism of M' upon M'' which leaves invariant every 
element in A. 

(3) The third associative law holds in A if, and only if, A is a free add of 
generators of some additive manifold. 

Proof. (1) is an immediate consequence of 2, Theorem 2 [if we let there 
in particular r = 1]; and (2) is an immediate consequence of (1). It is a 
consequence of 2, Theorem 1 that D(A) is freely generated by A*; and 
(3) may be deduced from this fact and 3, Corollary. 


DEFINITION 2. The subadd S of the additive manifold M is an 
independent add of generators of M, if 

(a) M = {S}, and if 

(b) s_0 = s_1 + ... + s_n, for s_i in S and 1 < n, imply contractability of the 
vector (s_1, ..., s_n) over the add S. 

Note that the additions occurring in (b) are effected in the additive 
manifold M whereas the contraction of the vector (s_1, ..., s_n) has to be 
effected within the add S. 

THEOREM 2. If Conditions (SR.1) and (SR.2) of section 4 are satisfied 
by the subadd S of the additive manifold M = {S}, then the following 
properties of M and S are equivalent. 

(i) S is an independent add of generators of M. 

(ii) If s_1, ..., s_n are elements in S such that 0 = s_1 + ... + s_n, then 
0 belongs to 𝔖_S(s_1, ..., s_n). 

(iii) M is freely generated by S and the strong associative law is satisfied 
by the add S. 


Proof. We note first that M is a group, as has been pointed out before. 
Since S is a subadd of a group, it follows from 3, Lemma 2 that the natural 
homomorphism is an isomorphism; and now it is a consequence of 4, Lemma 1 
that S is strictly self-reflexive. 

Assume that (i) is true and that 0 = s_1 + ... + s_n. Then either 
n = 1 and 0 = s_1 belongs to 𝔖_S(s_1) or else we infer from Definition 2, (b) 
the existence of an integer i with the following properties: 0 < i < n, and 
s_i + s_{i+1} = t is an element in S. Then 

0 = s_1 + ... + s_{i-1} + t + s_{i+2} + ... + s_n 




and now it is clear how to prove (ii) by complete induction with respect to n 
(together with 1, Lemma 1, (b)). 

Assume now the validity of (ii). Since S is a subadd of the group M, 
the identity is an isomorphism of S into M. Hence we may infer from 2, 
Theorem 1, (c) the existence of a homomorphism 𝔤 of D(S) into M such 
that x*𝔤 = x for every x in S. Clearly we have D(S)𝔤 = {S*}𝔤 = {S*𝔤} 
= {S} = M. Next we note that D(S) is a group too [4, Proposition 1]. 
If w is an element in D(S) such that w𝔤 = 0, then there exist elements 
w_1, ..., w_m in S such that w = <(w_1, ..., w_m)>; and we find that 

0 = w𝔤 = w_1 + ... + w_m. 

It follows from (ii) that 0 belongs to 𝔖_S(w_1, ..., w_m); and it follows 
from 3, Lemma 1 that (0) ~ (w_1, ..., w_m) (over S). Hence w = 0 and 
consequently 𝔤 is an isomorphism of D(S) upon M; and now we are assured 
of the existence of the reciprocal isomorphism 𝔤^{-1} of M upon D(S). If 𝔥 
is a homomorphism of S into the additive manifold N, then there exists a 
homomorphism 𝔣 of D(S) into N such that x*𝔣 = x𝔥 for x in S (2, Theorem 
1, (c)). Hence x𝔥 = x*𝔣 = x𝔤^{-1}𝔣 for x in S; and the homomorphism 𝔤^{-1}𝔣 
of M into N induces 𝔥 in S. This shows that M is freely generated by S. 
If the vectors (0) and (s_1, ..., s_n) are similar over S, then 

0 = 0𝔤 = (s_1* + ... + s_n*)𝔤 = s_1*𝔤 + ... + s_n*𝔤 = s_1 + ... + s_n; 

and it follows from (ii) that 0 belongs to 𝔖_S(s_1, ..., s_n). Hence it follows 
from 1, Lemma 1 (a) that the vector (s_1, ..., s_n) is contractable over S. 
Thus we have shown that the conditions (a), (b), (c) of 4, Theorem 2 are 
satisfied by S; and this shows the validity of the strong associative law in S. 
Hence (iii) is a consequence of (ii). 

Assume now the validity of (iii). Then we infer from Theorem 1, 
(1) the existence of an isomorphism 𝔭 of M upon D(S) such that x* = x𝔭 
for x in S. Suppose now that 1 < n and that the elements s_i in S satisfy 
s_0 = s_1 + ... + s_n [in M]. Then we have 

s_0* = s_0𝔭 = (s_1 + ... + s_n)𝔭 = s_1𝔭 + ... + s_n𝔭 = s_1* + ... + s_n* 
or (s_0) ~ (s_1, ..., s_n) [over S]. 

It follows from the strong associative law that s_0 belongs to 𝔖_S(s_0) 
= 𝔖_S(s_1, ..., s_n); and the contractability of the vector (s_1, ..., s_n) 
[over S] is a consequence of 1, Lemma 1, (a). Hence S is an independent 
add of generators of M [Definition 2]. Thus (i) is a consequence of (iii). 
This completes the proof. 




COROLLARY 1. The strong associative law is satisfied by the 
self-reflexive add S if, and only if, S is an independent add of generators of one 
(and essentially only one) group. 


This is easily deduced from Theorems 1 and 2. 


THEOREM 3. If A is an independent add of generators of the additive 
manifold M, and if the subadd S of A is closed in A, then S is an independent 
add of generators of the additive submanifold {S} of M. 


This is an almost immediate consequence of Definition 2 and 1, Definition 


COROLLARY 2. If the strong associative law is satisfied by the 
self-reflexive add A, if the self-reflexive subadd S of A is closed in A, and if 𝔞 
and 𝔰 are the natural homomorphisms of A and S respectively, then there 
exists an isomorphism 𝔤 of D(S) upon {S𝔞} such that x𝔰𝔤 = x𝔞 for every 
x in S. 

This is easily deduced from the preceding results. 

6. Concretions and amalgams. The following definition expresses a 
basic principle for the construction of adds. 


DEFINITION 1. If Φ is a system of subadds of the add A, if every element 
in A belongs to at least one of the subadds in Φ, and if the validity of the 
equation a + b = c in A implies the existence of an add in Φ which contains 
a, b, c, then A is the concretion of the adds in Φ. 

If A is the concretion of the adds in Φ, if B is some add, and if there 
is given to every add X in Φ a homomorphism 𝔥(X) of X into B such that 

u𝔥(X) = u𝔥(Y) for every u in X ∩ Y, 

then there exists clearly one and only one homomorphism 𝔥 of A into B 
such that 

u𝔥 = u𝔥(X) for every u in X. 

Thus homomorphisms of A are obtained by "concretion" of homomorphisms 
of the adds in Φ. 

The construction of adds by concretion of given adds proceeds as follows: 
Suppose that Θ is a set of adds such that the cross cut of any two adds X 
and Y in Θ is either vacuous or a common subadd of both X and Y. Then 
let A(Θ) be the set of all the elements belonging to at least one of the adds 
in Θ; and define addition in A(Θ) by the rule: 




a + b = c if, and only if, there exists at least one X in Θ which contains 
a, b and c and in which add X the equation a + b = c is valid. (See Bates 
[1] for the principle of concretion.) 
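
For finite adds the rule just stated can be carried out mechanically. The following Python sketch is an illustration only, under the assumption that each add is given as a pair (elements, addition table), the table being a dictionary sending (a, b) to a + b whenever that sum exists; the names `concretion`, `cyclic_copy`, `G` and `H` are hypothetical. The sketch raises an error if two of the given adds disagree on a sum, a situation which the hypothesis on cross cuts is meant to exclude.

```python
from itertools import product


def concretion(adds):
    """Form the concretion of a family of adds.

    adds: iterable of (elems, table) pairs, where table maps (a, b) to a + b.
    In the result a + b = c holds if, and only if, at least one of the given
    adds contains a, b, c and has a + b = c.
    """
    adds = list(adds)
    elems = set().union(*(set(e) for e, _ in adds))
    table = {}
    for _, member_table in adds:
        for pair, c in member_table.items():
            # setdefault keeps an earlier value; a different value is a conflict
            if table.setdefault(pair, c) != c:
                raise ValueError(f"the given adds disagree on the sum of {pair}")
    return elems, table


def cyclic_copy(tag, n):
    """A copy of the cyclic group of order n whose elements are labelled by tag."""
    elems = [(tag, i) for i in range(n)]
    table = {((tag, i), (tag, j)): (tag, (i + j) % n)
             for i, j in product(range(n), repeat=2)}
    return elems, table


# Two groups without common elements (illustrative data).
G = cyclic_copy("g", 2)
H = cyclic_copy("h", 2)
C_ELEMS, C_TABLE = concretion([G, H])
```

In the concretion of the two disjoint groups no element of the first can be added to an element of the second, so the resulting add has no null-element.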


EXAMPLE 0. Suppose that G and H are two groups which have no 
elements in common. The concretion of G and H is then an add without 0, 
since the null-element of G cannot be added to any element in H etc. If v 
is a vector over the concretion C of G and H, then 𝔖_C v is not vacuous if, 
and only if, all the coordinates of v belong to G or all the coordinates of v 
belong to H; and from this fact one easily deduces the validity of the strong 
associative law in C. 

The situation indicated in Example 0 is one that we want to avoid. 
This leads us to the following concept. 


DEFINITION 2. If Φ is a system of subgroups of the add A, if A is the 
concretion of the groups in Φ, and if the cross cut X ∩ Y of any two groups 
X and Y in Φ is a subgroup of both X and Y, then A is the amalgam of the 
groups in Φ. 

The concretion of the groups G and H without common elements, as 
discussed in Example 0, is clearly not their amalgam. 


EXAMPLE 1. Denote by {a}, {b}, {c}, {u}, {v}, {w} cyclic groups, not 0,
but of equal order; and let

U = {a} ⊕ {b}, V = {b} ⊕ {c}, W = {c} ⊕ {a}, T = {u} ⊕ {v} ⊕ {w}.

We identify elements according to the following rule:

iu = i(a − b), iv = i(b − c), iw = i(c − a) for integral i.

Then the groups U, V, W and T have the following common subgroups:

U ∩ V = {b}, V ∩ W = {c}, W ∩ U = {a},
U ∩ T = {u}, V ∩ T = {v}, W ∩ T = {w}.

Thus we may form the amalgam A of these four groups U, V, W and T. But
the weak associative law is not satisfied by the amalgam A,
since the summation 𝔖(a, −b, b, −c, c, −a) is easily seen to contain both
0 and the element u + v + w in T which is not 0.
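
The claim of Example 1 can be checked mechanically on a small instance.
The following Python sketch makes several assumptions that are not in the
text: all six cyclic groups are taken of order 3, elements of U, V, W are
stored as coordinate triples with respect to a, b, c (tag "g"), elements
of T outside {u}, {v}, {w} as triples with respect to u, v, w (tag "t"),
and the summation 𝔖 of a vector is taken to be the set of all elements
reachable by repeatedly adding two adjacent coordinates.

```python
N = 3  # common order of the cyclic groups (illustrative choice)


def to_T(e):
    """Coordinates of e with respect to u, v, w, or None if e is not in T."""
    kind, (x, y, z) = e
    if kind == "t":
        return (x, y, z)
    if z == 0 and (x + y) % N == 0:
        return (x, 0, 0)  # x(a - b) = xu
    if x == 0 and (y + z) % N == 0:
        return (0, y, 0)  # y(b - c) = yv
    if y == 0 and (x + z) % N == 0:
        return (0, 0, z)  # z(c - a) = zw
    return None


def from_T(t):
    """Canonical name of iu + jv + kw after the identifications."""
    i, j, k = t
    if j == 0 and k == 0:
        return ("g", (i, -i % N, 0))
    if i == 0 and k == 0:
        return ("g", (0, j, -j % N))
    if i == 0 and j == 0:
        return ("g", (-k % N, 0, k))
    return ("t", t)


def add(e, f):
    """Sum in the amalgam of U, V, W, T, or None when undefined."""
    if e[0] == "g" and f[0] == "g":
        # U, V, W consist of the triples whose c-, a-, b-coordinate is 0.
        if any(p == 0 and q == 0 for p, q in zip(e[1], f[1])):
            return ("g", tuple((p + q) % N for p, q in zip(e[1], f[1])))
    te, tf = to_T(e), to_T(f)
    if te is not None and tf is not None:
        return from_T(tuple((p + q) % N for p, q in zip(te, tf)))
    return None


def summation(add_fn, v):
    """All elements obtained by contracting the vector v to length 1."""
    v = tuple(v)
    if len(v) == 1:
        return {v[0]}
    found = set()
    for i in range(len(v) - 1):
        s = add_fn(v[i], v[i + 1])
        if s is not None:
            found |= summation(add_fn, v[:i] + (s,) + v[i + 2:])
    return found


def neg(e):
    return (e[0], tuple(-p % N for p in e[1]))


a, b, c = ("g", (1, 0, 0)), ("g", (0, 1, 0)), ("g", (0, 0, 1))
sums = summation(add, (a, neg(b), b, neg(c), c, neg(a)))
zero = ("g", (0, 0, 0))
u_plus_v_plus_w = ("t", (1, 1, 1))
```

Under these assumptions `sums` contains both `zero` and `u_plus_v_plus_w`,
in line with the statement of the example.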
It is easily seen that amalgams are always strictly self-reflexive adds.
It is therefore a consequence of 4, Corollary 2 that the first and second
associative laws are equivalent properties of amalgams.




FREE SUMS OF GROUPS AND THEIR GENERALIZATIONS. 729 


EXAMPLE 2. Suppose that the group G is the direct sum of three groups
H, J, K none of which is 0, in symbols: G = H ⊕ J ⊕ K. Let U = H ⊕ J,
V = J ⊕ K, W = K ⊕ H. Denote by A the concretion of the three groups
U, V and W. Since U ∩ V is their common subgroup J etc., A is the amalgam
of U, V and W. The identity mapping of A into G is a one-to-one
homomorphism of the add A into the group G; and so it follows from 3,
Lemma 2 that the natural homomorphism is one to one. Hence A has
Property (II).

Denote by h, j, k elements, not 0, in H, J and K respectively. Then
h − j belongs to U, j − k to V and h − k to W. Since h − j does not
belong to V nor to W, j − k does not belong to W nor to U, it follows that
the sum of h − j and j − k is undefined in the concretion A of U and V and
W. But the following vector similarities are easily verified to hold over
the add A:


(h − j, j − k) ~ (h, −j, j, −k) ~ (h, −k) ~ (h − k);


and this implies (h − j)* + (j − k)* = (h − k)*, showing that the natural
homomorphism of A into D(A) is not an isomorphism. Thus the amalgam A
has Property (II), but not Property (III).
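
A small numerical instance of Example 2 may make the undefined sum
concrete. The sketch below assumes, purely for illustration, that H, J, K
are cyclic of order 3, so that G is modelled as triples modulo 3 and U, V, W
are described by the coordinates on which their elements may be non-zero;
the function names are hypothetical.

```python
N = 3  # H, J, K are taken cyclic of order 3 (illustrative choice)

# G = H + J + K is modelled as (Z/3)^3; U, V, W are given by their supports.
SUPPORTS = {"U": {0, 1}, "V": {1, 2}, "W": {2, 0}}


def lies_in(x, support):
    """True when x vanishes outside the given set of coordinates."""
    return all(coord == 0 for i, coord in enumerate(x) if i not in support)


def add_in_G(x, y):
    return tuple((p + q) % N for p, q in zip(x, y))


def add_in_A(x, y):
    """Sum in the concretion A of U, V, W; None when undefined."""
    for support in SUPPORTS.values():
        if lies_in(x, support) and lies_in(y, support):
            return add_in_G(x, y)
    return None


def neg(x):
    return tuple(-p % N for p in x)


h, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
h_minus_j = add_in_A(h, neg(j))  # lies in U only
j_minus_k = add_in_A(j, neg(k))  # lies in V only
h_minus_k = add_in_A(h, neg(k))  # lies in W only
```

Here `add_in_A(h_minus_j, j_minus_k)` is undefined although the same sum
exists in G and equals `h_minus_k`, which is the situation the example
exploits.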

EXAMPLE 3. Suppose that the group G is the direct sum of four groups
H, J, K and L none of which is 0, in symbols: G = H ⊕ J ⊕ K ⊕ L. Denote
by A the concretion of the subgroups H ⊕ J, J ⊕ K, K ⊕ L and L ⊕ H
of G. It is easily seen again that the add A is the amalgam of these four
groups.
The identity mapping of A into G is clearly a one-to-one homomorphism
of A into G. Suppose now that x and y are elements in A whose sum z, as
defined in G, also belongs to A. Then one verifies by a simple discussion of
the possible cases that x, y and z all belong to one of the four groups
H ⊕ J, ···; and thus it follows that the identity mapping of A into G is
an isomorphism. It follows from 3, Lemma 2 that the natural homomorphism
of A into D(A) is an isomorphism.

Consider now elements p, q, r, s in H, J, K and L respectively, none of
which is 0. Then p − q, q − r, r − s and s − p belong to A, though none
of the sums of any two consecutive elements (like (p − q) + (q − r)) is
defined in A. Thus the vector (p − q, q − r, r − s, s − p) is not
contractable over A. On the other hand the following vector similarities
hold over A:


(p − q, q − r, r − s, s − p) ~ (p, −q, q, −r, r, −s, s, −p) ~ (p, −p)
~ (0).
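
The two vectors of Example 3 can be compared by brute force on a small
instance. The sketch assumes that H, J, K, L are cyclic of order 3 and that
the summation 𝔖 of a vector is the set of all elements obtained by
repeatedly adding two adjacent coordinates until one coordinate remains;
both the modelling and the function names are assumptions of the sketch.

```python
N = 3  # H, J, K, L are taken cyclic of order 3 (illustrative choice)

# Supports of H+J, J+K, K+L, L+H inside G = (Z/3)^4.
SUPPORTS = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]


def lies_in(x, support):
    return all(coord == 0 for i, coord in enumerate(x) if i not in support)


def add_in_A(x, y):
    """Sum in the concretion A of the four groups; None when undefined."""
    for support in SUPPORTS:
        if lies_in(x, support) and lies_in(y, support):
            return tuple((p + q) % N for p, q in zip(x, y))
    return None


def neg(x):
    return tuple(-p % N for p in x)


def summation(add_fn, v):
    """All elements obtained by contracting the vector v to length 1."""
    v = tuple(v)
    if len(v) == 1:
        return {v[0]}
    found = set()
    for i in range(len(v) - 1):
        s = add_fn(v[i], v[i + 1])
        if s is not None:
            found |= summation(add_fn, v[:i] + (s,) + v[i + 2:])
    return found


p, q, r, s = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
short_v = (add_in_A(p, neg(q)), add_in_A(q, neg(r)),
           add_in_A(r, neg(s)), add_in_A(s, neg(p)))
long_v = (p, neg(q), q, neg(r), r, neg(s), s, neg(p))

short_sums = summation(add_in_A, short_v)  # no contraction is possible
long_sums = summation(add_in_A, long_v)    # every full contraction gives 0
```

In this instance the summation of the short vector is vacuous while that of
the long vector consists of 0 alone, so the two similar vectors have
different summations.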




Thus the strong associative law is not satisfied by the amalgam A, though
the third associative law is valid in A.


We infer from 5, Theorems 1 and 2 that this amalgam is a free add of 
generators of some group, though it is not an independent add of generators 
of any group. 

What appears in the literature as “a free sum of groups with amalgamated
subgroups” (Schreier [1], Neumann [1] and others) is in the
present terminology a group for which a given amalgam of groups is either
a free or an independent add of generators. The preceding examples illustrate
some of the possibilities arising in connection with the problem of constructing
“free sums of groups with amalgamated subgroups.”


7. Schreier’s theorem. The sole objective of this section is the proof
of the following proposition.


THEOREM. If the add A is the amalgam of the groups in the set φ,
and if there exists a subgroup U of the add A such that U = X ∩ Y for
any two distinct subgroups X and Y in φ, then the strong associative law is
satisfied by the add A.


The proof of this theorem will be effected in a number of steps. 


(1) If 𝔖v is not vacuous, and if the vectors r and s are 1-fold contractions
of v, then r and s possess either a common contraction or there exists a 1-fold
contraction t of v such that both r, t and s, t possess common contractions.


Assume that this is not true. Then clearly r ≠ s. Furthermore it is
impossible that v = h + k, r = h′ + k, s = h + k′ where h′ and k′ are 1-fold
contractions of h and k respectively, since then h′ + k′ would be a common
contraction of r and s. Thus we may assume without loss in generality that
v = h + (a, b, c) + k, r = h + (a + b, c) + k, s = h + (a, b + c) + k. If
a, b, c were in the same subgroup of A, then h + (a + b + c) + k would be a
common contraction of r and s. Hence a and c belong to different subgroups
in φ; and it follows from our hypothesis that b belongs to U and that b
consequently may be added to every element in A.

If there exists a 1-fold contraction h′ of h, then t = h′ + (a, b, c) + k
would be a 1-fold contraction of v; and r, t would have the common
contraction h′ + (a + b, c) + k, and s, t would have the common contraction
h′ + (a, b + c) + k. In this fashion we see that neither h nor k can have
a 1-fold contraction.




If h = h′ + (d), then d + a exists if, and only if, d + a + b exists in
A, since b is in U. In this case we could let t = h′ + (d + a, b, c) + k; and
t, r would have the common contraction h′ + (d + a + b, c) + k, and t, s
would have the common contraction h′ + (d + a, b + c) + k. This is
impossible; and likewise we see the impossibility of k = (d) + k′ together with
the existence of one of the sums c + d or (b + c) + d.

From this discussion it follows that r and s are the only 1-fold
contractions of v, and that neither r nor s possesses a 1-fold contraction. But
this contradicts our hypothesis that 𝔖v is not vacuous [see 1, Lemma 1, (a)].
Hence (1) is true.


(2.n) If a is an element in 𝔖v, if v is a vector of length n and if w is a
1-fold contraction of v, then a belongs to 𝔖w.


This proposition we prove by complete induction with respect to n
[for 1 < n]. If v = (b, c), then a = b + c and w = (a) so that a is in
𝔖w too. Hence we may assume that 2 < n and that (2.j) has been verified
for 1 < j < n. Suppose that v is a vector of length n, that a is in 𝔖v and
that w is a 1-fold contraction of v. It follows from 1, Lemma 1, (a) that
there exists a 1-fold contraction v′ of v such that a belongs to 𝔖v′. This
proves our contention if v′ = w. If v′ ≠ w, and if v′ and w possess a common
contraction c, then it follows from the assertions (2.j) with j < n that a
belongs to 𝔖c; and it follows from 1, Lemma 1, (b) that a belongs to 𝔖w.
If v′ and w do not possess a common contraction, then it follows from (1)
that v possesses a 1-fold contraction t with the following properties: t and v′
possess a common contraction c and t and w possess a common contraction d.
As before we verify successively that a belongs to 𝔖c, hence to 𝔖t, hence to
𝔖d and therefore to 𝔖w. This completes the proof of (2.n).

From (2.n) and 1, Lemma 1, (b) we deduce that 𝔖v = 𝔖w whenever
w is a contraction of v; or in the terminology of section 2, (S.1) whenever
v and w are directly similar. But now one verifies immediately [see 2, (S.2)]
that v ~ w implies 𝔖v = 𝔖w. Hence the strong associative law holds in A,
as we claimed.


Remark 1. It follows from 5, Corollary 1 that the assertion of the
preceding theorem is equivalent to the fact that A is an independent add of
generators of some group G; and it is easily seen that G is, in the terminology
of Schreier [1], “a free sum of groups with one amalgamated subgroup.”
Thus we have obtained a new proof of Schreier’s theorem. It would, of
course, have been possible to obtain a proof of the preceding theorem by
direct reference to Schreier’s original theorem and 5, Corollary 1; but it
seemed appropriate to design a proof based on direct verification of the
strong associative law.


COROLLARY. The strong associative law is satisfied by every amalgam
of two groups.


This is obviously a special case of the preceding theorem. 


Remark 2. In the subsequent applications we shall only use this
Corollary; and the preceding theorem will be recovered as a special case of some
of the later results which are derived from this corollary. But the proof of
our theorem would not have been simplified if we had only considered
amalgams of two groups.


8. The structure of the amalgamations. If the add A is the amalgam
of the groups in the set φ, and if X and Y are two different groups in φ,
then the intersection X ∩ Y may be considered as having arisen by
amalgamation of a certain subgroup of X with a certain subgroup of Y. The
examples of Section 6 and Schreier’s theorem show that the question whether
or not the strong associative law is satisfied by some amalgam will depend
on the structure of these amalgamations; and the present section is devoted
to a study of this question.


LEMMA 1. Suppose that the add A is the concretion of the group G and
the self-reflexive add R, that R and G and A have the same null-element, that
R ∩ G is an independent add of generators of the group G, and that the
strong associative law is satisfied by A. Then the strong associative law is
satisfied by R.


Proof. It follows from our hypotheses that 4, (SR.1) and 4, (SR.2)
are satisfied by the add A. Hence we deduce from the strong associative law
and 4, Lemma 1 that A is strictly self-reflexive. Consequently −x is uniquely
determined by x; and now it is easy to see that R and R ∩ G are strictly
self-reflexive too.

It is a consequence of the strong associative law that A is an independent
add of generators of an essentially uniquely determined group H (5,
Corollary 1). Since R is a subadd of A, and A is a subadd of the group H,
R itself is a subadd of the group H. Denote by K the subgroup of H,
generated by R. Since R is self-reflexive, every element in K is a sum of
elements in R or K = {R}. We prove:


(1.a) R is an independent add of generators of the group K.




Suppose that 2 ≤ n, that r_1, ···, r_n are elements in R and that
0 = r_1 + ··· + r_n. We want to prove the contractability of the vector
(r_1, ···, r_n) over R. If this were not true, then r_i + r_{i+1} for
0 < i < n would not belong to R. We have to show that this hypothesis leads
to a contradiction. We note first the existence of some r_i that does not
belong to G. For otherwise (r_1, ···, r_n) would be a vector over R ∩ G.
But R ∩ G is an independent add of generators of G; and the equation
0 = r_1 + ··· + r_n would hold in G too and would imply the contractability
of the vector (r_1, ···, r_n) over R ∩ G and hence over R. Consequently
there exist integers n(j) with the following properties:


1 ≤ n(1) < n(2) < ··· < n(k) ≤ n;
r_i belongs to R ∩ G if, and only if, i ≠ n(j) for every j.
Now we define:
If 1 < n(1), then g(0) = r_1 + ··· + r_{n(1)−1};
if n(i) + 1 < n(i+1), then g(i) = r_{n(i)+1} + ··· + r_{n(i+1)−1};
if n(k) < n, then g(k) = r_{n(k)+1} + ··· + r_n.


It follows from our choice of the integers n(j) that g(i) is either undefined
or an element in G. Since the sum of the r_i (in their natural order) is 0,
and since addition in H is associative, the sum of the r_{n(j)} and g(j) in
their natural order is 0 too. Indicating typical terms only this last sum has
the following form:


(*) 0 = ··· + r_{n(i)} + r_{n(i+1)} + ··· + r_{n(j)} + g(j) + r_{n(j+1)} + ···

(where n(i) + 1 = n(i+1) and n(j) + 1 < n(j+1).)


We note that the r_{n(j)} belong to R, but not to G, whereas the g(j)
belong to G. Hence all the summands of (*) belong to the add A. But A
is an independent add of generators of the group H; and so there exist two
consecutive summands in the sum (*) whose sum belongs to A. Consequently
there arise three possibilities:


Case 1. There exists an i such that n(i) + 1 = n(i+1) and such
that r_{n(i)} + r_{n(i+1)} belongs to A.

Since by hypothesis the sum of these two elements does not belong to
R, and since A is the concretion of R and G, this would imply that
r_{n(i)}, r_{n(i+1)} and r_{n(i)} + r_{n(i+1)} belong to G. But this
contradicts our choice of the integers n(j). Thus this case cannot occur.




Case 2. There exists an integer i such that g(i − 1) and r_{n(i)} are both
defined and such that g(i − 1) + r_{n(i)} exists in A.
Since r_{n(i)} does not belong to G, and since A is the concretion of R
and G, it follows that g(i − 1), r_{n(i)} and g(i − 1) + r_{n(i)} all belong
to R. Thus g(i − 1) belongs to R ∩ G. But


g(i − 1) = r_{n(i−1)+1} + ··· + r_{n(i)−1}


(letting n(0) = 0, if necessary) where g(i − 1) as well as the summands
of the right side belong to R ∩ G. It is impossible that n(i − 1) + 1
= n(i) − 1, since then g(i − 1) = r_{n(i)−1} and consequently
r_{n(i)−1} + r_{n(i)} would be an element in R ∩ G. Thus the sum defining
g(i − 1) contains more than one summand. However, G ∩ R is supposed to be
an independent add of generators of G. Consequently the defining equation
for g(i − 1) implies the contractability of the vector
(r_{n(i−1)+1}, ···, r_{n(i)−1}) over R ∩ G.
Thus the sum of two consecutive r’s belongs to R ∩ G; and we have obtained
a contradiction again. Hence this case cannot occur.


Case 3. There exists an integer i such that r_{n(i)} and g(i) are both
defined and such that r_{n(i)} + g(i) belongs to A.

The impossibility of this case 3 is shown exactly as that of case 2.

Thus we have shown: If r_1, ···, r_n are elements in R such that 1 < n
and 0 = r_1 + ··· + r_n, then the vector (r_1, ···, r_n) is contractable
over R. From this fact one deduces easily the validity of Condition (ii) of 5,
Theorem 2. Hence (1.a) is true and the strong associative law is satisfied
by R, as we claimed.


LEMMA 2. Suppose that the add A is the concretion of the group G
and the self-reflexive add R, that R ∩ G is a subgroup of G. Then the
strong associative law holds in A if, and only if, it is satisfied by R.


Proof. We note first the fairly obvious facts that A is self-reflexive and
that R is closed in A. (For if a, b are elements in R, a + b = c in A, then
either a, b, c belong to R, as we want to show, or a, b, c belong to G, since A
is the concretion of R and G. But in the latter case a, b and hence a + b = c
belong to the subgroup R ∩ G of G, so that a + b = c belongs to R in either
case.)
If the strong associative law holds in A, then it follows from 3,
Proposition 1, (b) that the strong associative law holds in the closed subadd
R of A.
Assume conversely the validity of the strong associative law in R. Then
we infer from 5, Corollary 1 the existence of one and essentially only one
group H such that R is an independent add of generators of H. Since the
intersection of R and G is the subgroup S = R ∩ G of G, we may assume
without loss in generality that


(1) S = R ∩ G = H ∩ G.


We form the concretion K of the two groups H and G, subject to the rule (1).
We show


(2) A is a subadd of K such that A ∩ H = R.


It is clear that A is contained in K. Consider elements a, b, c in A.
If a + b = c holds in A, then either a, b, c are in R and a + b = c holds in
R and hence in H and in K or else a, b, c are in G and a + b = c holds in G
and hence in K. If conversely a + b = c holds in K, then either a, b, c are
in G and a + b = c holds in G and hence in A or else a, b, c are in H and
a + b = c holds in H. But if the elements a, b, c in H belong to A, then
they belong to the subadd R of H and so a + b = c holds in R and hence
in A. This proves (2).


(3) K is the concretion of A and H.


It is clear that every element in K belongs to A or to H, because every
element in K belongs already to G ≤ A or to H. Suppose now that a, b, c
are elements in K such that a + b = c. Then either a, b, c are in H and
a + b = c holds in H or else a, b, c are in G and a + b = c holds in G ≤ A,
since K is the concretion of G and H.


(4) The strong associative law is satisfied by K.


This is a consequence of Schreier’s theorem (7, Corollary), since, by (1),
K is the amalgam of two groups H and G.

Since the subadd A ∩ H = R of K is an independent add of generators
of the group H, it follows from (1) to (4) that we may apply Lemma 1,
showing that the strong associative law is satisfied by A. This completes
the proof.


PROPOSITION 1. Suppose that the add A is the amalgam of the finitely
many groups G_1, ···, G_n meeting the following requirements:


(a) The join A(i) of G_1, ···, G_i is their concretion.
(b) A(i) ∩ G_{i+1} is a subgroup of G_{i+1}.


Then the strong associative law is satisfied by A.


Proof. Note that every A(i) is the amalgam of G_1, ···, G_i, since A is
the amalgam of all the G_i, and since A(i) is the concretion of G_1, ···, G_i.
Note furthermore that A(1) = G_1 and A(n) = A.

We are going to prove by complete induction with respect to i that every
A(i) satisfies the strong associative law. This is certainly true for the
group A(1). Thus we assume that 1 < i and that the strong associative
law holds in A(i − 1). It follows from (a) that A(i) is the concretion of
the group G_i and of the self-reflexive add A(i − 1); and it follows from (b)
that the cross cut of A(i − 1) and G_i is a subgroup of G_i. It follows from
Lemma 2 that the validity of the strong associative law in A(i − 1) implies
its validity in A(i). This completes the induction. Consequently the strong
associative law is satisfied in particular by A(n) = A.


Remark 1. The need for condition (a) of Proposition 1 arises from the
following fact. Suppose that a, b, c are elements in G_1, G_2, G_3 respectively;
and suppose that a + b = c. The fact that A is the amalgam of G_1, ···, G_n
implies the existence of some G_i which contains a, b, c such that a + b = c
holds in G_i. But we are not assured that i may be chosen from 1, 2, 3 unless
we impose condition (a).

It is obvious how to deduce criteria for the validity of the strong
associative law by a formal combination of Proposition 1 and 3, Proposition 2.
More interesting than a general criterion of this type seem some special
instances.


THEOREM 1. Suppose that the add A is the amalgam of the groups in φ
and that there exists a group U in φ with the following property:
(*) If X, Y, Z are three distinct groups in φ such that X ∩ Y is neither
part of U nor of Z, then Z ∩ Y ≤ Y ∩ X.


Then the strong associative law is satisfied by A.


Proof. Suppose that G_1, ···, G_n are a finite number of groups in φ
which are all different from U. Denote by J the set theoretical join of U,
G_1, ···, G_n. We deduce a number of properties of J.


(1.1) If X is a group in φ, then X ∩ J = X ∩ U or else there exists an
integer m such that 1 ≤ m ≤ n and X ∩ J = X ∩ G_m.


This proposition is certainly true, if X is one of the G_i. Hence we
may assume that X ≠ G_i for every i. The proposition is likewise true, if
X ∩ G_i ≤ U for every i. Thus we may assume now without loss in generality
that there exists an integer k with the following properties:




0 ≤ k < n; X ∩ G_i ≤ U for i ≤ k; X ∩ G_i ≰ U for k < i ≤ n.
We prove:
(a) If k < i, then U ∩ X ≤ X ∩ G_i.
This follows by applying (*) on the triplet G_i, X, U in φ, since G_i ∩ X
is not part of U.
(b) If k < i and k < j, then G_j ∩ X ≤ X ∩ G_i or G_i ∩ X ≤ X ∩ G_j.
If G_i ∩ X ≰ X ∩ G_j, then G_i ∩ X is neither part of U nor of G_j.
Thus we may apply (*) on the triplet G_i, X, G_j; and it follows that
G_j ∩ X ≤ X ∩ G_i. This proves (b).

It follows from (b) that the subgroups X ∩ G_i of X with k < i ≤ n
form a finite ordered set. Hence there exists an integer m such that
k < m ≤ n and such that k < i implies X ∩ G_i ≤ X ∩ G_m. It follows
from (a) that X ∩ U ≤ X ∩ G_m. If i ≤ k, then it follows from our choice
of k that X ∩ G_i ≤ X ∩ U; and this implies X ∩ G_i ≤ X ∩ G_m.
Combining all these facts it follows that X ∩ J = X ∩ G_m; and this completes
the proof of (1.1).


(1.2) J is the concretion of U, G_1, ···, G_n.


Suppose that x, y, z are elements in J such that x + y = z holds in A.
Then there exists a group X in φ which contains x, y, z such that x + y = z
holds in X. It follows from (1.1) that either X ∩ J = X ∩ U or else
X ∩ J = X ∩ G_m for some m. In the first case x, y, z are in U and
x + y = z holds in U; and in the second case x, y, z are in G_m and
x + y = z holds in G_m. This proves (1.2). (Note that it is possible to prove
by quite similar arguments the fact that J is closed in A.)

Suppose now that F is a finite subset of A. Then there exists a finite
number of groups P_1, ···, P_p in φ which are all different from U such that
F is part of the join of U, P_1, ···, P_p. Denote by J(i) the join of U,
P_1, ···, P_i. Then J(0) = U and F ≤ J(p). It follows from (1.2) that
J(i) is the concretion of U, P_1, ···, P_i; and it follows from (1.1) that
J(i) ∩ P_{i+1} is a subgroup of P_{i+1} (remember that A is the amalgam of
the groups in φ). It follows from Proposition 1 that the strong associative
law is satisfied by J(p). Consequently it follows from 3, Proposition 2 that
the strong associative law is satisfied by A.


Remark 2. Since the antecedent in condition (*) is symmetric in X
and Y, this condition implies actually the following stricter statement:
If X, Y, Z are three distinct groups in φ such that X ∩ Y is neither
part of U nor of Z, then Z ∩ Y ≤ Y ∩ X and Z ∩ X ≤ X ∩ Y. But from
these inequalities one deduces readily that Z ∩ X = Z ∩ Y.


The following extreme special case of Theorem 1 is not only of great 
interest in itself, but also convenient for applications. (Another extreme 
special case will be treated in section 9 below.) 


COROLLARY 1. Suppose that the add A is the amalgam of the groups
in φ and that there exists a subgroup U of the add A with the following
property:


(**) If X and Y are different groups in φ, then X ∩ Y ≤ U.
Then the strong associative law is satisfied by A.


Proof. It is easily verified that A is the amalgam of the groups in φ
together with U; and thus we may assume without loss in generality that U
belongs to φ. It is clear that condition (*) of Theorem 1 is satisfied by U
and φ; and thus the strong associative law holds in A.


Remark 3. If we change in condition (**) inequality into equality, we 
obtain just Schreier’s theorem (section 7) which thus reappears as a special 
case of Theorem 1. We note that only a special instance of Schreier’s theorem 
(amalgam of two groups) has been used in our deductions. 

Our next application is mainly devoted to obtaining a theorem which, 
on the basis of 5, Theorem 1, is equivalent to a theorem discovered by H. 
Neumann [1]. For its formulation some concepts are needed. 

Suppose that the add A is the amalgam of the groups in the set φ. If X
is a given group in φ, then we denote by (X, φ) the subgroup of X which
is generated by all the subgroups X ∩ Y of X for Y in φ, Y ≠ X; and we
denote by (A, φ) the set of all the elements in A which belong to at least
one (X, φ).


LEMMA 3. If A is the amalgam of the groups in φ, then (A, φ) is the
amalgam of the groups (X, φ); and (A, φ) is closed in A.


Proof. Suppose that x, y, z are elements in A such that x + y = z; and
suppose that x and y belong to (A, φ). Since A is the amalgam of the
groups in φ, there exists a group G in φ such that x + y = z holds in G. Since
x is in (A, φ), there exists an X in φ such that x is in the subgroup (X, φ).
If X ≠ G, then x belongs to the subgroup (X, φ) ∩ G of X ∩ G which in
turn is a subgroup of (G, φ). Thus x belongs necessarily to the subgroup
(G, φ) of G; and the same holds for y. Hence x + y = z belongs to (G, φ)
too. This shows that (A, φ) is closed in A and that (A, φ) is the concretion
of the subgroups (Y, φ). If X ≠ Y, then (X, φ) ∩ (Y, φ) = X ∩ Y which
is a common subgroup of X, Y, (X, φ) and (Y, φ); and thus (A, φ) is the
amalgam of the groups (X, φ) for X in φ.


THEOREM 2. Suppose that the add A is the amalgam of the groups in φ.


(a) The natural homomorphism of A into its derived manifold is an
isomorphism if, and only if, the natural homomorphism of (A, φ) into its
derived manifold is an isomorphism.


(b) The strong associative law holds in A if, and only if, it is satisfied by
(A, φ).

Proof. It is a consequence of Lemma 3 that (A, φ) is closed in A.
Hence it follows from 3, Proposition 1 that (A, φ) has Property (III) or
(IV) whenever A has the corresponding property, showing the necessity of
our conditions.

If the strong associative law holds in (A, φ), then the natural
homomorphism of (A, φ) into its derived manifold is an isomorphism. Thus it
follows from 5, Theorem 1, (c) that (A, φ) is a free add of generators of one
and essentially only one group G (see also 4, Proposition 1) whenever either
the third or the fourth associative law is satisfied. We note that it will
suffice for the first part of our proof to assume that


(1) (A, φ) is a subadd of the group G;


and only during the second part of our proof will there be restrictions imposed
on our choice of the group G.


We may assume without loss in generality that


(2) G ∩ A = (A, φ).


With this condition (2) in mind we form the concretion K of A and G.
Let θ be the set of groups composed of G and the groups in φ. Since
G ∩ X = (X, φ) for X in φ, it follows that


(3) K is the amalgam of the groups in θ.


If the elements a, b, c belong to A, and if a + b = c holds in K, then
either a, b, c belong to A and a + b = c holds in A; or else a, b, c belong to




G and a + b = c holds in G. But in the latter case a, b, c belong to
G ∩ A = (A, φ). Since (A, φ) is a subadd of G, it follows that a + b = c
holds in (A, φ) and hence in A; and now it is easy to see that

(4) A is a subadd of K.

If X and Y are two different groups in θ, then it follows from our
construction of G that X ∩ Y ≤ G; if neither X nor Y is G, then
X ∩ Y ≤ (A, φ) ≤ G; thus (**) of Corollary 1 is satisfied by the amalgam
K of the groups in θ and we infer from Theorem 1 that


(5) the strong associative law holds in K.


From (5) we infer the validity of Property (III) in K and in all its
subadds (3, Proposition 1, (a)). Thus the subadd A of K has Property (III),
and this proves the validity of (a).

Assume now the validity of the strong associative law in (A, φ). Then
it follows from 5, Corollary 1 that (A, φ) is an independent add of generators
of one and essentially only one group; and thus we may assume now that
(A, φ) is an independent add of generators of the group G. Now it follows
from (2), (4), (5) and Lemma 1 that the strong associative law holds in
A too; and this completes the proof.


Remark 4. It is not difficult to see that Theorem 1 is a special case
of Theorem 2, (b).


Remark 5. Using 5, Theorem 1 and 5, Corollary 1 it is possible to
restate Theorem 2 in the following form:


(a′) The amalgam A of the groups in φ is a free add of generators of some
group if, and only if, (A, φ) is a free add of generators of some group.


(b′) The amalgam A of the groups in φ is an independent add of generators
of some group if, and only if, (A, φ) is an independent add of generators
of some group.


Proposition (a′) is equivalent to Neumann’s theorem (see H. Neumann
[1], p. 599, Theorem).


9. Homogeneous Amalgams. Homogeneity of amalgams may be defined
in a variety of ways. We consider only a fairly strong type of homogeneity.


DEFINITION. If the add A is the amalgam of the groups in the set φ,
and if, for every non vacuous subset θ of φ, the join of the groups in θ is a
closed subadd of A, then A is a homogeneous amalgam of the groups in φ.




It is trivial to construct amalgams which are not homogeneous. The
following characterization of homogeneous amalgams will be convenient to use.


LEMMA. The following properties of the amalgam A of the groups in
the set φ are equivalent.


(i) A is the homogeneous amalgam of the groups in φ.


(ii) If x, y, z are elements in A such that x + y = z, and if x belongs to
X in φ and y to Y in φ, then x, y, z belong all to X or they all belong to Y.


(iii) If X, Y, Z belong to φ, then X ∩ Y ≤ Y ∩ Z or Z ∩ Y ≤ Y ∩ X.


Proof. Assume first the validity of (i). Suppose that x, y, z are elements
in A, that x + y = z, that x is in X, y is in Y. The join J of the groups
X and Y is closed in A; and so z belongs to J too. This implies that z
belongs to X or to Y; and we may assume without loss in generality that z
belongs to X. There exists a group Z in φ such that x + y = z holds in Z,
since A is the amalgam of the groups in φ. Consequently x and z belong to
the common subgroup X ∩ Z of X and of Z. This implies that y belongs to
X ∩ Z too. Thus x, y, z belong to X. Hence (ii) is a consequence of (i).

Assume next the validity of (ii). Consider groups X, Y, Z in φ; and
assume that X ∩ Y ≰ Y ∩ Z and Z ∩ Y ≰ Y ∩ X. Then there exists an
element x in X ∩ Y which does not belong to Y ∩ Z; and there exists an
element z in Z ∩ Y which does not belong to Y ∩ X. Thus x and z are both
in Y; and consequently there exists a well-determined element y in the group
Y such that z + x = y. On the other hand x cannot be in Z nor z in X,
contradicting (ii). This shows that (iii) is a consequence of (ii).

Assume finally the validity of (iii). Suppose that x, y, z are elements
in A such that x + y = z; and suppose that they belong to the groups X, Y
and Z respectively (in φ). There exists a group W in φ such that x + y = z
holds in W, since A is the amalgam of the groups in φ. It follows from
(iii) that X ∩ W ≤ W ∩ Y or Y ∩ W ≤ W ∩ X. Since x belongs to
X ∩ W, it follows in the first of these cases that x and y belong to the common
subgroup W ∩ Y of Y and W, implying that x, y, z belong to W ∩ Y ≤ Y;
and in the same fashion we show in the second case that x, y, z belong to X.
But from this fact it is easily deduced that A is the homogeneous amalgam
of the groups in φ. Thus (i) is a consequence of (iii).
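
Condition (iii) of the Lemma is a finite check when the groups are finite.
The sketch below tests it for two families of subgroups of an elementary
abelian 2-group, each subgroup being given by the coordinates on which its
elements may be non-zero. The helper names and the choice of examples are
assumptions of the sketch: the first family is modelled on Example 2 of
Section 6, the second is a family in which any two distinct groups meet in
one and the same subgroup.

```python
from itertools import product


def subgroup(support, n, order=2):
    """Elements of (Z/order)^n that vanish outside `support`."""
    return frozenset(
        x for x in product(range(order), repeat=n)
        if all(c == 0 for i, c in enumerate(x) if i not in support)
    )


def satisfies_iii(groups):
    """Condition (iii): X∩Y ≤ Y∩Z or Z∩Y ≤ Y∩X for all X, Y, Z."""
    return all(
        (X & Y) <= (Y & Z) or (Z & Y) <= (Y & X)
        for X, Y, Z in product(groups, repeat=3)
    )


# U = H+J, V = J+K, W = K+H as in Example 2 of Section 6.
example_2 = [subgroup({0, 1}, 3), subgroup({1, 2}, 3), subgroup({2, 0}, 3)]

# Three groups any two of which meet in the same subgroup.
common_core = [subgroup({0, 1}, 4), subgroup({0, 2}, 4), subgroup({0, 3}, 4)]
```

In this model `satisfies_iii(example_2)` is false, since U ∩ V = J and
V ∩ W = K are not comparable, whereas `satisfies_iii(common_core)` is true.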


THEOREM. Suppose that A is the homogeneous amalgam of the groups
in the set φ.


(a) If θ is a nonvacuous subset of φ, then the join A(θ) of the groups in θ
is the homogeneous amalgam of the groups in θ.


(b) The strong associative law is satisfied by every A(θ) (and in particular
by A).


(c) If A is an independent add of generators of the group G, then A(θ),
for θ a nonvacuous subset of φ, is an independent add of generators of the
subgroup {A(θ)} of G.

Proof. (a) is an immediate consequence of the preceding Lemma; (b)
may be deduced from the preceding Lemma in conjunction with 8, Theorem 1;
and (c) is a consequence of 5, Theorem 3, since A(θ) is closed in A.


UNIVERSITY OF ILLINOIS, 
URBANA, ILLINOIS. 


BIBLIOGRAPHY. 


E. Artin [1] “Das freie Produkt von Gruppen,” F. Klein: Vorlesungen über höhere
Geometrie. Herausgegeben von W. Blaschke; Berlin, 1926, § 92, pp. 361-363.

[2] “The free product of groups,” American Journal of Mathematics, vol. 69
(1947), pp. 1-4.

Grace Bates [1] “Free nets and loops and their generalizations,” American Journal of
Mathematics, vol. 69 (1947), pp. 499-550.

H. Neumann [1] “Generalized free products with amalgamated subgroups,” American
Journal of Mathematics, vol. 70 (1948), pp. 590-625.

O. Schreier [1] “Die Untergruppen der freien Gruppen,” Hamburger Abhandlungen, vol.
5 (1927), pp. 161-183.




THE ISOPERIMETRIC PROBLEM FOR MINKOWSKI AREA.* 


By HERBERT BUSEMANN. 


The present paper centers about the following problem: A positive
continuous function $\sigma(u)$ is defined on all unit vectors $u$ of $E^n$. To find among
all simple closed surfaces $D$ in a certain class $\Gamma$ and with a given "area"
$\int_D \sigma(u)\,dS$ those which bound the maximal (Euclidean) volume $V(D)$. Here
$dS$ means the (Euclidean) surface element of $D$ and $u$ the exterior normal
of $D$ "at $dS$."

For the rather restricted class $\Gamma$ which consists of all convex surfaces
and the non-convex surfaces whose first partials piecewise satisfy Lipschitz
conditions, the problem is solved by the isoperimetric inequality

(*)  $\int_D \sigma(u)\,dS \ge n\,V^{(n-1)/n}(D)\,V^{1/n}(K)$,

where $K$ is the polar reciprocal (with respect to the unit sphere) to the
boundary of the convex closure of the surface $C$: $\sigma^{-1}(u)u$; and the equality
sign holds only for $D$ homothetic to $K$.¹

The author's interest in the problem is due to the fact that (*) solves
the isoperimetric problem for the intrinsic area in Minkowski (or finite
dimensional Banach) spaces (Section 6). The solution is in general not a
Minkowski sphere, but a new convex surface² which proves of great importance
for the theory of Finsler spaces. The reader interested in these questions
is referred to the author's lecture on "The Geometry of Finsler Spaces"
which is going to appear in the Bulletin of the American Mathematical
Society. The present paper shows only that Minkowski's definition of area
in Euclidean spaces by means of parallel sets carries over to Minkowski
spaces if the parallel sets are formed with solutions of the isoperimetric
problem instead of spheres.


* Received February 11, 1948; revised January 31, 1949. 

¹ If $u$ means the interior normal, then $K$ must be replaced by the surface symmetric
to $K$ with respect to the origin.

² For the plane the problem was solved in [3] and the methods of [3] are used
here in Sections 1 and 5. However, in the plane case the solution of the isoperimetric
problem reduces after a rotation through $\pi/2$ to the sphere of the dual space (in Banach
space terminology), and is therefore not an essentially new geometric object.




Using (*) presupposes the validity of the representation $\int \sigma(u)\,dS$ for
the intrinsic area of a surface of class $\Gamma$ in Minkowski spaces. But [2],
where this area is defined, establishes this representation only for surfaces
of class $D'$, and therefore not for all convex surfaces. The final Section 7
of the present paper fills this gap by proving the much more general
theorem that $\int \sigma(u)\,dS$ represents the intrinsic area of any rectifiable manifold
in a Finsler space. For rectifiable manifolds in Riemann spaces $\int \sigma(u)\,dS$
also represents the various other areas (of Lebesgue, Radó, Federer, . . .)
none of which is intrinsic by definition. Thus the present theorem
contributes to the theory of these areas the information that they are intrinsic
in the rectifiable case.

The form of (*) indicates a close connection with the Brunn-Minkowski
Theory; in fact, the area $\int \sigma(u)\,dS$ was first studied by Minkowski in [9,
§ 27], and (*) follows for convex $C$ and $D$ directly from Minkowski's
inequality (Section 1). For non-convex $D$ the relation (*) follows from
Lusternik's generalization [8] of the Brunn-Minkowski Theorem to
non-convex sets, see Section 2. However, Lusternik's method does not yield the
condition for the equality sign in (*). Therefore Dinghas and Schmidt
proved in [5] Lusternik's result again for the special case where $C$ is a
sphere, by adapting Kneser and Süss' proof of the Brunn-Minkowski Theorem
(compare [1, pp. 88-90]), and showed how this method leads through an
elegant additional device to the conditions for the equality sign in (*).

Sections 3 and 4 of the present paper use the method of [5], at times
verbatim, for arbitrary convex $C$ instead of spheres, and go slightly farther,
to yield a necessary and sufficient condition for the equality sign in the
isoperimetric problem for the Minkowski area corresponding to $C$.

This result implies (*) for convex $C$. The passage to arbitrary $C$
(Section 5) is accomplished in a surprisingly simple manner by an idea
which the author called regularity principle³ and which seems applicable to
many similar problems.


1. Let $\sigma(u)$ be a positive continuous function defined on the unit
vectors $u$ of $E_n$, and such that the surface $C$: $\sigma^{-1}(u)u$ is convex. The
definition of $\sigma(u)$ is extended to arbitrary vectors $x$ by

$\sigma(x) = |x|\,\sigma(x/|x|)$ for $x \ne 0$,  $\sigma(x) = 0$ for $x = 0$.

Then $C$ has the equation $\sigma(x) = 1$ and $\sigma(x)$ is a convex positive homogeneous


³ Compare [4]. The author was incautious enough to state in [4] that the present
problem for non-convex $D$ was hopeless. He was then not aware of the paper [8] by
Lusternik.




function of the first degree in $x$. Therefore it is the supporting function of a
convex body $L$ with boundary $K$. It is well known [9, § 8] that $K$ originates
from $C$ by a polar reciprocity in the unit sphere $|x| = 1$. If $\sigma(x)$ is
analytic and $D$ is any other analytic closed convex surface bounding the
convex body $E$, then

(1)  $\int_D \sigma(u)\,dS = nV_1(E, L)$,

where (see [9, § 27] or [1, p. 64]) $u$ is the exterior normal of $D$ "at $dS$" and

(2)  $V_1(E, L) = V(E, \dots, E, L)$

is the mixed volume of $L$ and $(n-1)$-times $E$. If $D$ and $K$ are arbitrary
closed convex surfaces, then sequences $\{D_\nu\}$ and $\{K_\nu\}$ of analytic convex
surfaces exist, which tend to $D$ and $K$ respectively ([1, section 27]).
Convergence of convex surfaces implies convergence of their supporting planes.
Since $V_1(D, K)$ depends also continuously on $D$ and $K$ ([1, p. 40]) it follows
from (1) and (2) that for any convex $D$ and $K$

(3)  $\int_D \sigma(u)\,dS = nV_1(E, L) = \lim_{\rho \to 0+} \rho^{-1}(|E + \rho L| - |E|)$.


The integral on the left is extended over all points of $D$ where the tangent
plane exists, $|X|$ denotes generally the exterior Lebesgue measure of the set
$X$, and $X + \rho Y$ means the set consisting of all points of the form $x + \rho y$
with $x \in X$ and $y \in Y$. For the last part of (3) compare [1, p. 47]. The
number $nV_1(E, L)$ is called the area of $D$ relative to $K$ and was first studied
by Minkowski ([9, § 27]). The inequality of Minkowski (see [1, p. 91] or
the next section of this paper)

$V_1^n(E, L) \ge |E|^{n-1}|L|$

and (3) yield

(4)  $\int_D \sigma(u)\,dS \ge n\,|E|^{(n-1)/n}|L|^{1/n}$

and the equality sign holds only for $E$ homothetic to $L$ or $D$ homothetic to $K$.
Therefore

(5) If $C$: $\sigma^{-1}(u)u$ is convex then the surfaces homothetic to the polar
reciprocal $K$ of $C$ and only these minimize $\int_D \sigma(u)\,dS$ among all convex
surfaces $D$ with a given volume $|E|$.
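The limit in (3) and the inequality (4) can be checked numerically in the plane, where $n = 2$ and (4) reads $\lim \rho^{-1}(|E + \rho L| - |E|) \ge 2|E|^{1/2}|L|^{1/2}$. The following is only a sketch under stated assumptions: the polygons, the step `rho` and all function names are illustrative choices and not part of the paper, and the Minkowski sum is formed as the convex hull of pairwise vertex sums, which is valid for convex polygons only.

```python
def polygon_area(pts):
    """Shoelace formula for a simple polygon given as a list of (x, y)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices counterclockwise."""
    pts = sorted(set(pts))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def minkowski_sum(E, L, rho):
    """E + rho*L for convex polygons, as the hull of all vertex sums."""
    return convex_hull([(ex + rho * lx, ey + rho * ly)
                        for ex, ey in E for lx, ly in L])


def relative_area(E, L, rho=1e-6):
    """Difference quotient (|E + rho L| - |E|) / rho, approximating n V_1(E, L)."""
    return (polygon_area(minkowski_sum(E, L, rho)) - polygon_area(E)) / rho


# Illustrative data (assumed): a 3 x 1 rectangle E and a square L of side 2.
E = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
L = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
area_rel = relative_area(E, L)
bound = 2.0 * (polygon_area(E) * polygon_area(L)) ** 0.5
```

For this choice the supporting function of $L$ equals 1 on the four normals of the rectangle, so the relative area is the ordinary perimeter 8, which exceeds the bound $2\sqrt{12}$. Taking $E = L$ gives a difference quotient close to 8, which is the bound itself, in agreement with the equality statement for homothetic bodies.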


2. The discussion of the same problem for non-convex $D$ rests on the
expression which the right side of (3) gives for $\int_D \sigma(u)\,dS$. If $H$ is the
interior of $K$, then for any convex $E$

$|E + \rho H| = |E + \rho L|$.

Denoting the closure of the set $X$ by $\bar X$, any set $X$ satisfies the relation

(6)  $\bar X + \rho H = X + \rho H$.

Therefore we define generally as Minkowski area $B(X)$ relative to $K$ of the
boundary of a set $X$ with $|\bar X| < \infty$, the number

(7)  $B(X) = \liminf_{\rho \to 0+} \rho^{-1}(|X + \rho H| - |\bar X|)$.

Because of (6)

(8)  $B(X) = B(\bar X)$.

For this area the isoperimetric problem can be solved completely. In order
to formulate the solution adequately we introduce the following notation.
For any bounded set $M$ with $|\bar M| > 0$ and a given direction $u$, let $a'$ be the
least upper bound of those $d$ for which the intersection of $\bar M$ with the set
$xu \le d$ has measure 0 and $b'$ the greatest lower bound of the $d$ for which
$\bar M$ intersects $xu \ge d$ in a set of measure 0. We call $a' \le xu \le b'$ the essential
supporting strip $W(u)$ of $M$ normal to $u$. Obviously $W(u) = W(-u)$ and
$|\bar M \cap W(u)| = |\bar M|$. The intersection

$M_E = \bar M \cap \prod_u W(u)$

is called the essential part of $M$. By definition $M_E = (\bar M)_E$. There is a
countable set $\{u_\nu\}$ of directions $u$ such that $\prod_u W(u) = \prod_\nu W(u_\nu)$. Since

$\bar M - M_E = \bigcup_\nu [\bar M - \bar M \cap W(u_\nu)]$

it follows that

(9)  $|M_E| = |\bar M|$.
Corresponding to (4) we are going to prove the isoperimetric inequality:

THEOREM 1. For any set $M$ with $0 < |\bar M| < \infty$,

(10)  $B(M) \ge n\,|\bar M|^{(n-1)/n}|L|^{1/n}$,

and the equality sign holds if and only if $M_E$ is homothetic to $L$ and
$B(M) = B(M_E)$.




Notice that the theorem is trivial for unbounded $M$ since then
$|M + \rho H| = \infty$ for any $\rho$, so that $B(M) = \infty$. In the proof we assume
therefore that $M$ is bounded. We call topological sphere a topological image
$D$ of the unit sphere $|x| = 1$. If $M$ is the interior of $D$ then $M_E = \bar M$.
Therefore Theorem 1 yields

(11) If $C$ is convex then the surfaces homothetic to $K$ and only these
minimize $B(M)$ among all topological spheres $D$ with a given (positive,
finite) volume $|M|$ or $|\bar M|$, where $M$ is the interior of $D$.


(10) without the discussion of the equality sign follows from the
Brunn-Minkowski Theorem $|M + \rho H|^{1/n} \ge |\bar M|^{1/n} + \rho|H|^{1/n}$ which yields

$|M + \rho H| - |\bar M| \ge n\rho\,|\bar M|^{(n-1)/n}|H|^{1/n}$.

The relation $|X + \rho Y|^{1/n} \ge |X|^{1/n} + \rho|Y|^{1/n}$ was given for arbitrary
measurable $X$ and $Y$ by Lusternik in [8]. For reasons explained in the
introduction we follow very closely the method of [5] to establish Theorem 1
and the Brunn-Minkowski Theorem for arbitrary convex $Y$. With minor
modifications the present proof works for any $Y$ which contains an open
set $Z$ with $|Z| = |Y|$.


3. The purpose of the next two sections is then to prove simultaneously
Theorem 1 and

THEOREM 2. If $M$ is a bounded set and $K'$ is a bounded convex set
with interior points, then for any $\rho > 0$

(12)  $|M + \rho K'|^{1/n} \ge |M|^{1/n} + \rho\,|K'|^{1/n}$

and the equality sign holds if and only if $M$ is homothetic to $K'$.


Note. If either $M$ or $K'$ is not bounded or $K'$ has no interior points
(12) is trivially correct. The conditions for the equality sign are different,
but equally trivial.
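A quick sanity check of an inequality of the Brunn-Minkowski type, read here as $|M + \rho K'|^{1/n} \ge |M|^{1/n} + \rho|K'|^{1/n}$, is possible for axis-parallel boxes, because the sum of a box with sides $a_i$ and $\rho$ times a box with sides $b_i$ is again a box, with sides $a_i + \rho b_i$. The sketch below assumes this special case only; the function names and the sample side lengths are illustrative and do not come from the paper.

```python
import math


def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    return math.prod(sides)


def brunn_minkowski_gap(a, b, rho):
    """|M + rho K'|^(1/n) - (|M|^(1/n) + rho |K'|^(1/n)) for boxes M, K'.

    a, b are the side lengths of M and K'; the sum M + rho K' is the box
    with side lengths a_i + rho * b_i.
    """
    n = len(a)
    summed = [ai + rho * bi for ai, bi in zip(a, b)]
    lhs = box_volume(summed) ** (1.0 / n)
    rhs = box_volume(a) ** (1.0 / n) + rho * box_volume(b) ** (1.0 / n)
    return lhs - rhs


# Non-homothetic boxes: the gap is strictly positive.
gap_generic = brunn_minkowski_gap([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], 0.5)
# Homothetic boxes (b = 2a): the gap vanishes up to rounding.
gap_homothetic = brunn_minkowski_gap([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 0.7)
```

The vanishing gap in the homothetic case illustrates the equality condition of Theorem 2 in this restricted family; it does not, of course, replace the proof for general sets.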

If $H$ is the set of interior points of $K'$ then $|H| = |K'|$, hence
Theorem 2 follows, because of (6), from the special case where $K' = H$
and $M = \bar M$. Instead of (12) we therefore prove

(13)  $|M + \rho H|^{1/n} \ge |M|^{1/n} + \rho\,|H|^{1/n}$ for closed $M$.


Theorem 2 is trivial for $|M| = 0$. For if $M$ contains more than one
point then $|M + \rho H| > \rho^n|H|$. And if $M$ consists of one point, then
$|M + \rho H| = \rho^n|H|$ and $M$ is homothetic to $H$. Therefore we may assume
that $|M| > 0$, and it means no restriction if we assume in addition that
$|M| = |H|$, because the general case follows immediately from this case.

The proof proceeds by induction. The very simple case $n = 1$ is left
to the reader, and Theorem 2 is assumed to hold for $n - 1$. In $E_n$ introduce
rectangular coordinates $x, y_1, \dots, y_{n-1}$; let $a$ and $b$ ($\alpha$ and $\beta$) be the minimum
and maximum of $x$ in $M$ ($H$). Then $\alpha \le x \le \beta$ is both the ordinary and
the essential supporting strip of $\bar H$ or $H$ normal to the $x$-axis, but the
essential supporting strip of $M$ normal to the $x$-axis will in general have
the form $a' \le x \le b'$ with $a \le a' < b' \le b$.

For $a \le d \le b$ ($\alpha \le \delta \le \beta$) denote by $M(d)$ ($H(\delta)$) the intersection of
the halfspace $x \le d$ ($x \le \delta$) with $M$ ($H$), and by $m(d)$ ($h(\delta)$) the
intersection of $M$ ($H$) with $x = d$ ($x = \delta$). Then $H(\alpha) = h(\alpha) = h(\beta) = 0$. If
$|X|'$ is the $(n-1)$-dimensional measure of a set $X$ in a hyperplane, then

(14)  $|M(d)| = \int_a^d |m(x)|'\,dx, \qquad |H(\delta)| = \int_\alpha^\delta |h(x)|'\,dx$.


Following the original idea of Brunn we introduce the function $z(x)$ by the
relation

$|H(z)| = |M(x)|$.

$z(x)$ is a monotone increasing, absolutely continuous function of $x$ with

(15)  $z(x) = \alpha$ for $a \le x \le a'$,  $z(x) = \beta$ for $b' \le x \le b$,
      $\alpha < z(x) < \beta$ for $a' < x < b'$.


Because of $|h(z)|' > 0$ for $\alpha < z < \beta$ the relations (14) and (15) show that

(16)  $dz/dx = 0$ for $a \le x < a'$ and $b' < x \le b$,
      $dz/dx = |m(x)|'/|h(z(x))|'$ for almost all $x$ with $a' \le x \le b'$.
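Brunn's function $z(x)$, defined by $|H(z)| = |M(x)|$, can be computed numerically once the slice measures $|m(x)|'$ and $|h(z)|'$ are known. The sketch below is an illustration under stated assumptions: the slice functions are supplied by the caller, the integrals are approximated by the midpoint rule, and $z$ is found by bisection; none of these names or numerical choices appear in the paper.

```python
def cumulative(slice_measure, lo, x, steps=2000):
    """Midpoint-rule approximation of the integral of slice_measure on [lo, x]."""
    width = (x - lo) / steps
    return sum(slice_measure(lo + (i + 0.5) * width) for i in range(steps)) * width


def brunn_z(x, m, a, h, alpha, beta, tol=1e-9):
    """Solve |H(z)| = |M(x)| for z by bisection on [alpha, beta].

    m, h are the slice measures of M and H; a, alpha the left ends of
    their supporting strips. Assumes |M| = |H| and h > 0 inside (alpha, beta).
    """
    target = cumulative(m, a, x)
    lo, hi = alpha, beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative(h, alpha, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative planar example (assumed): M = [0, 2] x [0, 1], H = [0, 1] x [0, 2].
m = lambda x: 1.0   # every vertical slice of M has length 1
h = lambda z: 2.0   # every vertical slice of H has length 2
z_mid = brunn_z(1.0, m, 0.0, h, 0.0, 1.0)
slope = (brunn_z(1.2, m, 0.0, h, 0.0, 1.0) - brunn_z(0.8, m, 0.0, h, 0.0, 1.0)) / 0.4
```

In this example $z(x) = x/2$, so the computed slope is close to $1/2 = |m(x)|'/|h(z(x))|'$, as the second line of (16) predicts.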
Now construct the set $M^*$ in the strip $a + \rho\alpha \le d^* \le b + \rho\beta$ by defining
its intersection $m^*(d^*)$ with the plane $x = d^*$ as follows: Because $d + \rho z(d)$
is a strictly monotone and continuous function of $d$, a given $d^*$ in $[a + \rho\alpha,
b + \rho\beta]$ determines exactly one $d$ in $[a, b]$ such that $d + \rho z(d) = d^*$. We
then put

(17)  $m^*(d^*) = m(d) + \rho h(z(d))$ if $a' < d < b'$ and $m(d) \ne 0$,
      $m^*(d^*) = 0$ for all other $d$ in $[a, b]$.




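Since $h(z(d))$ lies in $x = z(d)$ and $m(d)$ in $x = d$, the set $m^*(d^*)$ lies
really in $x = d + \rho z(d) = d^*$, and since $\rho h(z(d)) \subset \rho H$ and $m(d) \subset M$
the set $M^*$ is contained in $M + \rho H$.

The inductive hypothesis yields

(18)  $(|m^*(d^*)|')^{1/(n-1)} \ge (|m(d)|')^{1/(n-1)} + \rho\,(|h(z(d))|')^{1/(n-1)}$

for the first set of $d$'s in (17). The relations (16) and (18) show that for
almost all $x$ in $[a', b']$ with $|m(x)|' > 0$

(19)  $|m^*(x^*)|'\,(1 + \rho\,dz/dx) \ge |m(x)|'\,(1 + \rho\lambda^{1/(n-1)})^{n-1}(1 + \rho\lambda^{-1})$,

where $\lambda = |h(z(x))|'/|m(x)|'$. But

(20)  $(1 + \rho\lambda^{1/(n-1)})^{n-1}(1 + \rho\lambda^{-1}) \ge (1 + \rho)^n$ if $\rho > 0$, $\lambda > 0$

and the equality sign holds only for $\lambda = 1$ (compare [5] or [6, p. 61,
Theorem 64]). (19) and (20) yield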
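The elementary inequality used at this step, read here as $(1 + \rho\lambda^{1/(n-1)})^{n-1}(1 + \rho\lambda^{-1}) \ge (1 + \rho)^n$ for $\rho > 0$, $\lambda > 0$, with equality only for $\lambda = 1$, is easy to probe numerically. The following sketch merely evaluates both sides on a small grid; the function names and the grid values are illustrative assumptions, and a grid check is of course no substitute for the references cited.

```python
def lhs_20(rho, lam, n):
    """Left side (1 + rho*lam^(1/(n-1)))^(n-1) * (1 + rho/lam); needs n >= 2."""
    return (1.0 + rho * lam ** (1.0 / (n - 1))) ** (n - 1) * (1.0 + rho / lam)


def rhs_20(rho, n):
    """Right side (1 + rho)^n."""
    return (1.0 + rho) ** n


def min_gap(n, rhos, lams):
    """Smallest value of lhs - rhs over the given grid of rho and lambda."""
    return min(lhs_20(r, l, n) - rhs_20(r, n) for r in rhos for l in lams)


rhos = [0.1, 1.0, 5.0]
lams = [0.2, 0.5, 1.0, 2.0, 7.0]
gap_n3 = min_gap(3, rhos, lams)                 # attained at lambda = 1
strict = lhs_20(2.0, 3.0, 2) - rhs_20(2.0, 2)   # (1 + 6)(1 + 2/3) - 9
```

For $n = 2$ the inequality reduces to $\lambda + \lambda^{-1} \ge 2$, which is why the sample value with $\lambda = 3$ is strictly positive.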


(21)  $|m^*(x^*)|'\,(1 + \rho\,dz/dx) \ge |m(x)|'\,(1 + \rho)^n$

for almost all $x$ in $[a', b']$ with $|m(x)|' > 0$. But (21) holds for all $x$
with $|m(x)|' = 0$. The definition of $z(x)$ and (15) show that $|m(x)|' = 0$
a.e. in $[a, a']$ and $[b', b]$, so that (21) holds a.e. in $[a, b]$. Hence

(22)  $|M + \rho H| \ge |M^*| = \int_{a+\rho\alpha}^{b+\rho\beta} |m^*(x^*)|'\,dx^*
       = \int_a^b |m^*(x^*)|'\,(1 + \rho\,dz/dx)\,dx$
       $\ge (1 + \rho)^n \int_a^b |m(x)|'\,dx = (1 + \rho)^n|M|$.

This proves the inequalities (10) and (12). We also observe that the
equality sign in (12) implies $|M + \rho H| = |M^*|$. This implies
that $a' = a$ and $b' = b$, because $M^*$ contains no points in $x < a' + \rho\alpha$ and
$M + \rho H$ intersects any strip $a + \rho\alpha < x < c$ in a set of positive measure,
because $m(a) \ne 0$. Varying the direction of the $x$-axis we find


(23) If the equality sign holds in (12), then $M_E = M$.


It is now quite simple to discuss the equality sign in (12), but (23) and
Theorem 1 contain Theorem 2, because, as pointed out by Dinghas and
Schmidt, the equality sign in (12) for some $\rho > 0$ implies equality for any
$\rho' < \rho$. For if $\rho = \rho' + \delta$ then $M + \rho H = (M + \rho' H) + \delta H$ hence

$|M|^{1/n} + \rho|H|^{1/n} = |M + \rho H|^{1/n} \ge |M + \rho' H|^{1/n} + \delta|H|^{1/n}$
$\ge |M|^{1/n} + \rho'|H|^{1/n} + \delta|H|^{1/n} = |M|^{1/n} + \rho|H|^{1/n}$.

Therefore the equality in (12) implies (for $\rho \to 0$) equality in (10).


4. That the equality sign holds in (10) when $M_E$ is homothetic to $L$
and $B(M) = B(M_E)$ is trivial. But it is worth noticing that the condition
$B(M) = B(M_E)$ is not superfluous. For if $N$ is a sequence of points which
tends to $\infty$ and $M = L \cup N$, then $M_E = L$ but $B(M) > B(L)$. Examples
of W. H. and G. C. Young in [11, pp. 276-281] show that the same effect
may be obtained with suitable convergent sequences.


(8) and the formulation of Theorem 1 show that we may restrict
ourselves for the "only if" part in the discussion of the equality sign to the
case of closed $M$. We show first:

(24) If the equality sign holds in (10) then $|m(d)|' > 0$ for $a' < d < b'$.
First let $a < d < b$, put $M(d) = M'$, and call $M''$ the intersection of $M$ with
$x \ge d$. Let $m'_\rho$ and $m''_\rho$ be the intersections of $M' + \rho H$ and $M'' + \rho H$
with $x = d$. If $P$ denotes the orthogonal projection of $H$ on $x = 0$ then the
intersection of $M' + \rho H$ ($M'' + \rho H$) with $x > d$ ($x < d$) is contained in the
cylinder with base $m'_\rho + \rho P$ ($m''_\rho + \rho P$) and altitude $(\beta - \alpha)\rho$ (the width
of $\rho H$ in the direction of the $x$-axis). Therefore

$|(M' + \rho H) \cap (M'' + \rho H)| \le (\beta - \alpha)\rho\,(|m'_\rho + \rho P|' + |m''_\rho + \rho P|')$.

Since $M + \rho H = (M' + \rho H) \cup (M'' + \rho H)$ it follows that

$|M + \rho H| \ge |M' + \rho H| + |M'' + \rho H| - (\beta - \alpha)\rho\,(|m'_\rho + \rho P|' + |m''_\rho + \rho P|')$,

hence because of $|M| = |M'| + |M''|$,

(25)  $\rho^{-1}(|M + \rho H| - |M|) \ge \rho^{-1}(|M' + \rho H| - |M'|) + \rho^{-1}(|M'' + \rho H| - |M''|)$
       $- (\beta - \alpha)(|m'_\rho + \rho P|' + |m''_\rho + \rho P|')$.

Because $m'_\rho \supset m(d)$, $m''_\rho \supset m(d)$, and $m(d)$ is closed, both $m'_\rho + \rho P$ and
$m''_\rho + \rho P$ tend for $\rho \to 0$ to $m(d)$, and it follows from (7) and (25) that

(26)  $B(M) \ge B(M') + B(M'') - 2(\beta - \alpha)|m(d)|'$.

Assume now that $a' < d < b'$. Then $|M'| > 0$, $|M''| > 0$ and (10) and
(26) yield

$B(M) \ge n\,|L|^{1/n}(|M'|^{(n-1)/n} + |M''|^{(n-1)/n}) - 2(\beta - \alpha)|m(d)|'$
$= n\,|M|^{(n-1)/n}|L|^{1/n}(|M'|^{(n-1)/n}|M|^{(1-n)/n} + |M''|^{(n-1)/n}|M|^{(1-n)/n}) - 2(\beta - \alpha)|m(d)|'$.

Because of $|M'| + |M''| = |M|$,

$(|M'|/|M|)^{(n-1)/n} + (|M''|/|M|)^{(n-1)/n} > 1$,

so that

$B(M) > n\,|M|^{(n-1)/n}|L|^{1/n} - 2(\beta - \alpha)|m(d)|'$.

Hence if $|m(d)|' = 0$ the equality sign cannot hold in (10), which proves (24).


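Putting $\lambda = \lambda(x) = |h(z(x))|'/|m(x)|'$ it follows from (24) and
(19) that equality in (10) implies that a.e. in $[a', b']$

$|m^*(x^*)|'\,(1 + \rho\,dz/dx) \ge |m(x)|'\,(1 + \rho\lambda^{1/(n-1)})^{n-1}(1 + \rho\lambda^{-1})$,

hence (compare (22))

$|M + \rho H| \ge |M^*| \ge \int_{a'}^{b'} |m(x)|'\,dx + \rho \int_{a'}^{b'} |m(x)|'\,[(n-1)\lambda^{1/(n-1)} + \lambda^{-1}]\,dx$

and

$\liminf_{\rho \to 0+} \rho^{-1}(|M + \rho H| - |M|) \ge \int_{a'}^{b'} |m(x)|'\,[(n-1)\lambda^{1/(n-1)} + \lambda^{-1}]\,dx$.

Now

$(n-1)\lambda^{1/(n-1)} + \lambda^{-1} \ge n$

and the equality sign holds only for $\lambda(x) = 1$. This can be seen by
differentiation or from the theorem of the arithmetic and geometric mean (see
[6, p. 17]).
Therefore (since we assumed $|M| = |L|$)

$B(M) > n\,|M| = n\,|M|^{(n-1)/n}|L|^{1/n}$

unless $\lambda(x) = 1$ a.e. in $[a', b']$.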

The absolute continuity of $z(x)$, the equality in (10), and (16) imply
therefore that $z(x) = x + \alpha - a'$ for $a' \le x \le b'$. If we place $L$ so that its
center of gravity coincides with that of $M$, then $\lambda(x) = |h(z)|'/|m(x)|' = 1$
a.e. shows that the essential supporting strip $a' \le x \le b'$ of $M$ must be a
supporting strip of $L$.

As a convex body, $L$ is the intersection of its supporting strips. Varying
the direction of the $x$-axis we see that $M_E \subset L$. If a point $p \in L - M_E$
existed, then a sphere $S$ with center $p$ and disjoint from $M_E$ would exist
because $M_E$ is closed. Then $|S \cap L| > 0$ because $L$ is convex, hence
$|L| > |M_E|$, which contradicts (9) and $|L| = |M|$.
This proves that $L$ and $M_E$ are homothetic and

$B(M_E) = n\,|M|^{(n-1)/n}|L|^{1/n}$

if the equality sign holds in (10). Therefore also $B(M) = B(M_E)$.


5. The statement (11) solves the main problem for convex $C$ and those
$D$ for which $\int_D \sigma(u)\,dS = B(M)$. Unfortunately the class of surfaces for
which this relation holds, even when $\sigma(u) = 1$, is very restricted. Since the
best possible results can doubtless not be obtained through using $B(M)$, we
mention only one class $\Gamma'$ for which the relation

(27)  $\int_D \sigma(u)\,dS = B(M)$

is very easily established: $\Gamma'$ consists of topological spheres of class $D'$
where in addition the set of points in which the first partial derivatives are
continuous can be covered by a finite number of neighborhoods in which
these partials satisfy a Lipschitz condition. The class $\Gamma$ is defined to consist
of $\Gamma'$ and all closed convex surfaces.


(28) If $C$ is convex then the surfaces homothetic to $K$ and only these
minimize $\int_D \sigma(u)\,dS$ among all surfaces in $\Gamma$ with a given volume.


From this theorem we derive the corresponding theorem for arbitrary $C$.
Let $\bar C$ denote the boundary of the convex closure of $C$ and let $\bar\sigma^{-1}(u) \cdot u$ be
the point of the form $\lambda \cdot u$, $\lambda > 0$, on $\bar C$. Then $\bar\sigma(u) \le \sigma(u)$ hence

(29)  $\int_D \bar\sigma(u)\,dS \le \int_D \sigma(u)\,dS$

for every surface $D$ in $\Gamma$. We are going to show that the equality sign holds
in (29) for surfaces $D$ which are homothetic to the polar reciprocal $K$ of $\bar C$
with respect to $|x| = 1$. It suffices, of course, to prove this for $K$ itself.
Every point $p$ of $\bar C$ lies on an $(n-1)$-simplex with vertices in $C$ (see
[1, p. 9]), and therefore on a simplex $T(p)$ of lowest dimension $\nu \le n - 1$
with vertices on $C$. Then $T(p) \subset \bar C$. This is obvious for $\nu = 0$. For $\nu > 0$
the point $p$ is an interior point of $T$ (considered as set in a $\nu$-dimensional
plane), and a hyperplane through $p$ either contains $T(p)$ or separates at
least two of the vertices of $T(p)$. A supporting plane $\pi$ of $\bar C$ through $p$ must
therefore contain $T$, and since $\pi \cap \bar C$ is convex, $T(p) \subset \bar C$.

If $\nu \ge 1$ and $q_0, q_1, \dots, q_\nu$ are the vertices of $T(p)$, then $p$ has the
form

(30)  $p = \sum \alpha_i q_i$,  $\alpha_i > 0$,  $\sum \alpha_i = 1$,

hence, after defining $\bar\sigma(x)$ in the same way as $\sigma(x)$ for $|x| \ne 1$,

$1 = \bar\sigma(p) \le \sum \alpha_i \bar\sigma(q_i) \le \sum \alpha_i \sigma(q_i) = 1$  or

(31)  $\bar\sigma(p) = \sum \alpha_i \bar\sigma(q_i)$.

If $\bar\sigma(x)$ is now interpreted as supporting function of a body $L$ with boundary
$K$ then (30) and (31) imply that at a point of $K$ where the exterior normal
is parallel to $p$, more than one, namely $\nu + 1$ linearly independent, supporting
planes exist, so that $K$ is not differentiable.

The crucial point is now that on the one hand for any point $p$ of $\bar C$,
but not on $C$, the dimension of $T(p)$ is obviously at least one, and on the
other hand the points on $\bar C$ and not on $C$ are exactly the points $p = \bar\sigma^{-1}(u)u$
for which $\bar\sigma(u) < \sigma(u)$. Therefore the preceding discussion shows that
$\bar\sigma(u) = \sigma(u)$ at every point of $K$ where a tangent plane exists, so that

(32)  $\int_K \bar\sigma(u)\,dS = \int_K \sigma(u)\,dS$.


If $V(D)$ denotes the volume bounded by the surface $D$ in $\Gamma$, then
Theorem 1, (27), (28), (29) and (32) contain the following main theorem:

THEOREM 3. If $\sigma(u)$ is defined for $|u| = 1$, positive and continuous,
then for any surface $D$ in $\Gamma$

$\int_D \sigma(u)\,dS \ge n\,V^{(n-1)/n}(D)\,V^{1/n}(K)$,

where $K$ is the polar reciprocal to the boundary $\bar C$ of the convex closure of $C$:
$\sigma^{-1}(u)u$ with respect to the unit sphere. The equality holds only for $D$
homothetic to $K$.


Although this theorem shows that the question whether $C$ is convex or
not is not of great importance as far as the isoperimetric problem for the
Minkowski area $\int \sigma(u)\,dS$ is concerned, there are profound differences
regarding other properties of this area. This is made clear by the following
theorem which is due to Minkowski (see [9, § 27]):




(33) The function $\sigma(x)$ (or the surface $C$) is convex if and only if the
following holds: whenever the convex body $L_1$ contains the convex body $L_2$
then $\int_{K_1} \sigma(u)\,dS \ge \int_{K_2} \sigma(u)\,dS$, where $K_i$ is the boundary of $L_i$.

An equivalent condition, which will be discussed elsewhere, is that the
hyperplanes be minimal (that is area minimizing) surfaces.


6. Theorem 3 leads also to a solution of the isoperimetric problem in
Minkowski geometry. Let $x_1, x_2, \dots, x_n$ be a rectangular coordinate system
in a Euclidean space associated with a given Minkowski metric $\mu$ (for this
and the following see section 2 in [2]). Then a positive homogeneous
function $f(x)$ of degree 1 exists, which is positive for $x \ne 0$ and such that
$\mu(x, y) = f(x - y)$. Area and volume are defined only if $\mu$ is symmetric,
that is $\mu(x, y) = \mu(y, x)$ or $f(x) = f(-x)$. The surface $T$: $f(x) = 1$ is
the Minkowskian unit sphere. It bounds the closed convex set $S$: $f(x) \le 1$.

If $|X|$ denotes the exterior Lebesgue measure of the set $X$ with respect
to the associated Euclidean space, then $X$ has the Minkowski measure

(34)  $|X|_m = \sigma\,|X|$

where

$\sigma = \omega^{(n)}/|S|$ and $\omega^{(n)} = \pi^{n/2}\,\Gamma^{-1}(n/2 + 1)$,

that is, $\omega^{(n)}$ is the volume of the Euclidean unit sphere.
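The normalisation behind the Minkowski measure can be made concrete with a few lines of code. The sketch assumes the reading $|X|_m = \sigma|X|$ with $\sigma = \omega^{(n)}/|S|$ and $\omega^{(n)} = \pi^{n/2}/\Gamma(n/2 + 1)$; the function names and the sample unit ball (the square $[-1, 1]^2$ of the maximum norm) are illustrative choices, not taken from the paper.

```python
import math


def omega(n):
    """Volume of the Euclidean unit ball in n dimensions: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)


def minkowski_measure(euclidean_measure, unit_ball_measure, n):
    """|X|_m = sigma * |X| with sigma = omega(n) / |S| (assumed normalisation)."""
    sigma = omega(n) / unit_ball_measure
    return sigma * euclidean_measure


# Maximum norm in the plane: S = [-1, 1]^2 has Euclidean area 4.
area_of_unit_ball = minkowski_measure(4.0, 4.0, 2)   # Minkowski measure of S itself
area_of_unit_square = minkowski_measure(1.0, 4.0, 2)
```

With this normalisation the Minkowskian unit ball always receives the measure $\omega^{(n)}$, here $\pi$, whatever its Euclidean shape.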


The intrinsic area of a surface in a Minkowski space is defined in terms
of the one-dimensional metric $\mu$, but admits for hypersurfaces $D$ of class $D'$
an integral representation of the form $\int \sigma(u)\,dS$, which is obtained as
follows: Let the tangent hyperplane of $D$ at a point where it exists, have
normal $u$ (because of the symmetry of $\mu$ or $S$ we need not distinguish
between interior and exterior normals). The hyperplane normal to $u$
through the origin intersects $S$ in an $(n-1)$-dimensional convex set $S(u)$.
If $|X|'$ denotes exterior $(n-1)$-dimensional Lebesgue measure with
respect to the associated Euclidean space, then

$\sigma(u) = \omega^{(n-1)}/|S(u)|'$.

The intrinsic area of a topological sphere $D$ of class $D'$ is then

(35)  $A_m(D) = \int_D \sigma(u)\,dS$

where $dS$ is, of course, the Euclidean surface element (see the next section).
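In the plane the integrand has a simple meaning, which the following sketch exploits. Assuming $\sigma(u) = \omega^{(1)}/|S(u)|'$ with $\omega^{(1)} = 2$, and noting that the chord of $S$ through the origin in the unit direction $t$ perpendicular to $u$ has Euclidean length $2/f(t)$, one gets $\sigma(u) = f(t)$, so the intrinsic area of a polygon is the sum of the Minkowski lengths $f$ of its edge vectors. The norm `f_max` (the maximum norm) and all names are illustrative assumptions.

```python
import math


def f_max(v):
    """Maximum norm in the plane, a symmetric Minkowski metric f(x) = f(-x)."""
    return max(abs(v[0]), abs(v[1]))


def minkowski_perimeter(polygon, f):
    """Sum of f over the edge vectors of a closed polygon (planar case of (35))."""
    edges = zip(polygon, polygon[1:] + polygon[:1])
    return sum(f((x2 - x1, y2 - y1)) for (x1, y1), (x2, y2) in edges)


# Minkowskian unit sphere T of the maximum norm: the boundary of [-1, 1]^2.
T = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
area_T = minkowski_perimeter(T, f_max)   # each edge has Minkowski length 2
n_volume_T = 2.0 * math.pi               # n * V_m(T), since V_m(T) = omega^(2) = pi
```

Here the intrinsic area of $T$ is 8 while $nV_m(T) = 2\pi$, a small instance of the fact that the Minkowskian unit sphere need not satisfy the relation between area and volume that holds for the Euclidean sphere.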




Since convex surfaces are in general not of class $D'$ this representation
has not yet been established for the whole class $\Gamma$, but this gap will be filled
in the next section.

The surface $C$: $\sigma^{-1}(u)u$ is convex. (A proof is found in the author's
paper "A theorem on convex bodies of the Brunn-Minkowski type,"
Proceedings of the National Academy of Sciences, vol. 35 (1949), pp. 27-31).
If $K$ denotes the polar reciprocal of $C$ with respect to $|x| = 1$ and $V_m(D)$
is the Minkowski volume (34) enclosed by the surface $D$, then Theorem 3,
(34) and (35) yield

(36)  $A_m(D) \ge n\sigma^{-1}\,V_m^{(n-1)/n}(D)\,V_m^{1/n}(K)$.


The factor $\sigma^{-1}$ appears because $K$ depends on the choice of the coordinate
system $x$. But "homothetic" is independent of the coordinate system and
has a Minkowski meaning. $K$ can be characterized in terms of
perpendicularity in a very similar manner as in the two dimensional case (see [3, p. 867]),
but this will be discussed elsewhere.

For a satisfactory Minkowski isoperimetric inequality it is desirable to
eliminate the factor $\sigma^{-1}$ which depends on the associated Euclidean metric.
This can be most easily accomplished by choosing among all the surfaces
homothetic to $K$ the surface $K^* = \sigma^{-1}K$. Then

(37)  $V_m^{1/n}(K^*) = \sigma^{-1}V_m^{1/n}(K)$.

A Minkowskian characterization of $K^*$ is obtained by substituting $K^*$
for $D$ in (36):

(38)  $A_m(K^*) = nV_m(K^*)$.

This relation may be considered as the Minkowskian analogue to the
Euclidean fact that $n\omega^{(n)}$ is the surface of the unit sphere. (The relation
$A_m(T) = nV_m(T)$ is in general not correct.) Since the associated Euclidean
space to a given Minkowski metric is determined up to a non-degenerate
affine transformation, the class $\Gamma$ is Minkowski invariant. Therefore we may
formulate the solution of the Minkowski isoperimetric problem as follows:


THEOREM 4. For all surfaces $D$ in $\Gamma$

$A_m(D) \ge n\,V_m^{(n-1)/n}(D)\,V_m^{1/n}(K^*)$,

where $K^*$ is homothetic to the polar reciprocal of $C$ with respect to $|x| = 1$
and $A_m(K^*) = nV_m(K^*)$. The equality sign holds if and only if $D$ is
homothetic to $K^*$.

The choice of $K^*$ has the great advantage that it permits us to generalize
the fundamental relation (7) to Minkowskian geometry. It is not true that
the intrinsic area of $D$ is the limit of $\rho^{-1}(|M + \rho S|_m - |M|_m)$, where $M$
is the interior of $D$. But


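THEOREM 5. If $H^*$ is the interior of $K^*$, then for any surface $D$ in $\Gamma$
with interior $M$

$A_m(D) = \lim_{\rho \to 0+} \rho^{-1}(|M + \rho H^*|_m - |M|_m)$.

Proof. For the interior $H$ of $K$ the relations (27), (34) and (35) yield

$\int_D \sigma(u)\,dS = A_m(D) = \lim \rho^{-1}(|M + \rho H| - |M|)$
$= \lim \sigma^{-1}\rho^{-1}(|M + \rho H|_m - |M|_m)$
$= \lim (\rho\sigma)^{-1}(|M + \rho\sigma(\sigma^{-1}H)|_m - |M|_m)$
$= \lim \rho^{-1}(|M + \rho H^*|_m - |M|_m)$,

which proves the Theorem because $H^* = \sigma^{-1}H$.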


In analogy to (7) we may now define as Minkowski area $B_m(X)$ of the
boundary of a set $X$ in a Minkowski space the number

(39)  $B_m(X) = \liminf_{\rho \to 0+} \rho^{-1}(|X + \rho H^*|_m - |\bar X|_m)$,

and Theorem 1 yields

THEOREM 6. For any set $M$

$B_m(M) \ge n\,|\bar M|_m^{(n-1)/n}\,|H^*|_m^{1/n}$,

and the equality sign holds if and only if $M_E$ is homothetic to $H^*$ and
$B_m(M) = B_m(M_E)$.


7. To make the preceding result valid, (35) must be established for 
convex surfaces. The results and notations of [2] will be used freely. 
Since convex surfaces in a Minkowski space are birectifiable, (35) is con- 
tained in the following general theorem: 


THEOREM 7. If $(P, x)$ is a $k$-dimensional rectifiable manifold in a
Finsler space $F$ with local rectifiable representation $x(u)$, then


Dp 


where o™2) is the term o2Z(™) of [2,(7,6)] (the k’s in 2”) and 


A‘“-) are misprints) and means the following: 


The derivatives $\partial x_i/\partial u_j$ exist for a definite $u$-system and a definite
$x$-system almost everywhere (a.e.) in the $u$-coordinate neighborhood. If the
matrix $\partial x/\partial u = (\partial x_i/\partial u_j)$ has rank $k$ then the $k$-dimensional plane
in the tangent space $\xi$ at $x(u)$ spanned by the $k$ vectors $\partial x/\partial u_j$ intersects the
set $\Phi(x, \xi) \le 1$ in a $k$-dimensional convex set whose $k$-dimensional measure
with respect to $\xi$ considered as a rectangular coordinate system is
$\omega^{(k)}/\sigma^{(k)}(u)$. Since $A^{(k)}(u)\,du_1 \cdots du_k$ is the Euclidean surface
element of $(P, x)$ at $x(u)$ (see [2, (7.5)]) (40) has the form (35) (but $u$
does not have the same meaning), moreover $A^{(k)}(u) = 0$ if $\partial x/\partial u$ has
smaller rank than $k$. Therefore $\sigma^{(k)}(u)$ need not be defined in that case,
we simply put the whole integrand $\sigma^{(k)}(u)A^{(k)}(u) = 0$.

Because of the additivity properties of |P,2|, | P,|x%, and the 
integral, the general case can be reduced to the local case where P is a bounded 
closed convex set with positive measure | P|’ in EZ” with rectangular coordi- 
nates Ux) and x(w) moves in one coordinate neighborhood (2) 
of P. We choose these coordinates once and for all, and can then replace 
(w) and (u) by and A(w). We show first that 


(41) | | f o(u)A(u)du. 
P 


The paper [2] contains a mistake regarding the measure | P,2| ,%. 
The relation | P,, 7° | » =2 |2(P:Vix),8}n on p. 254 is in general not 
correct. Therefore the proof that | P,2 |» is monotone does not hold, and, 
in fact, this measure is not monotone. This follows from another repre- 
sentation for | P,« |,” which Prof. H. Federer communicated to the author. 
If $P$ is the union of a countable number of compact sets in some metric
space, $x(p)$ maps $P$ continuously into a metric space with distance $\delta$, and
$\nu(x)$ denotes the number of (maximal connected) components of the set of
points $p$ in $P$ with $x(p) = x$, then


(42) | P, 2 | f v(x)d 
a(P) 


where the right side means the integral of $\nu(x)$ over $x(P)$ with respect to
$n$-dimensional Hausdorff measure on $x(P)$. It follows from (42) that
$|P, x|_n^\delta$ is invariant under the change of the parametrization, that is
[2, (4.8)] (the proof given in [2] does not hold because it uses
monotoneity). It is easy to see that the integral in (42) is not monotone.

It is known that (41) holds if it has been established for an $F_\sigma$ subset
of $P$, on which the mapping $p \to x(p)$ is one-to-one, compare H. Federer
and A. P. Morse, "Some properties of measurable functions," Bulletin of the
American Mathematical Society, vol. 49 (1943), pp. 270-277, Theorem 3.1
and Section 5. This reference implies the mentioned reduction for the integral
over the number $N(x)$ of points $p$ with $x(p) = x$, but because of the
rectifiability $N(x) \ne \nu(x)$ at most on a set $M_0$ with $|M_0, \delta|_n = 0$.

We therefore consider an Fa subset P of P on which p—2(p) is one-to- 
one. It follows from (42) that | P, |,“ —|2(P),8|,. Rectifiability means 
the existence of a constant $B$ such that the distance $\delta$ in $F$ satisfies the
relation

(43)  $\delta(x(u), x(v)) \le B\,|u - v|$ for $u, v$ in $P$.

The partials $\partial x_i/\partial u_j$ exist therefore a.e. in $P$ and are bounded.
Consequently $A(u)$ exists a.e. and is bounded. $\sigma(u)$ is bounded because $P$ is closed.

For a given e > 0 there exists a compact subset P’ of P with | P—P’ <« 
such that x(u) is uniformly totally differentiable on P’. By Kolmogoroff’s 
Principle | P— P’, «| p* | P—P’ | B%e hence 


pre. 


The boundedness of A(w)o(u) and (44) show that it suffices to establish 
(41) for P’. 

The uniform total differentiability of x(u) on P’ implies: for every 
point there is a p, >0 such that for ujeP’ S(uo,p1), 2, 
(S(uo, pi) is the set of all points u in E* with | u—uo| < pi). 


(45) —2(u.) = y(t) —y(u2) +2] — | with | 2| <4 
We put 
(46) = (Uo) + Uo) (u) /du. 


If A(uo) 540, then 


* (45) is stated in [7, p. 363] and proved in [10, p. 751]. 




k(u) > 0. 


(47) | y(u1) — y(uz)| = | — 
There also is a > 0 such that for we P’ S(uo; pz) 
(48) =1+4+6 with |6| <e. 
(45) and (47) imply the existence of a ps; > 0, p3 Spi, such that 
(49) | — x(uz)| = k(uo)| ur — Us | for uy P’ 1 S(uo, ps). 
With the notations 
(50) w=a2(u,) —a(u), v=y(th) —y(Uz), D=w/| wl], v| 


we conelude from 


w(|v |—|wl)+]w|(w—v) w 


| 
(45), and (49) that 
(51) |w—t|<2|2|/k for S(uo, ps). 
We put @)(€) = @(a(u),é). Then for | —1 and a suitable p’ > 0 
(52) €) = 1+ 6 with |6| Se when | < p’. 


Because ® is homogeneous in é the relation (52) holds for all E40. If 8 is 
the (Minkowski) distance which corresponds to ®) as integrand, then it 
follows from (52) that 

(53) ) =1+4 with | Se, 

when ps) with a suitable py > 0. Moreover 


| w | Bo(@) — | v| 


@)(w) /®,(v) =1+ 
Since ||w|—/v||S|w—v| it follows from (45) and (51) that 


for a suitable positive p; < ps; 
(54) ,(w)/®)(v) =1+4+ 6 with |6| Se for P’ S(uo, ps). 


Let p(uo) = min (ps, ps, ps), then combination of the results (48), (53) 
and (54) yields for any measurable set QC P’ () S( uo, p(uo) ) 




(55) | 2(Q),8| (1 | 2(Q), 8 | (1 + | y(Q), Bo | 
= (1+ 62)** | y(Q), 7 | (Uo) 
= (1 6.) »)a(uy)d 
(1 + 62) (uy) du 


= 6,)** J 


where (2, y) is the Euclidean distance | and | 6;| Se. 

Let P” be the subset of P’ where A(u) 40. There is a sequence of 
spheres S(u‘,p(u‘)) which cover P”. Putting Q@:=8S,P” and generally 
On = 17) (CU Qi) we have P” = |) Q; and Q; [1 Q; =0. Because 

4-1 
the mapping x(w) is one to one on P”, x(Q;) [) (Q2) =0 and the sets 
x(Qi) are measurable, therefore by (55) 


= (1+ 6,)*" Jo du. 
f o(u)a(u) du 


Since by [7, p. 362] or [10, p. 753] 
it follows that 
| | f o(u)A(u) du. 
P’ 


By a method which illustrates convincingly the strength of Kolmogoroff's
Principle, Theorem 7 can be reduced to (41) and Section 8 of [2]. As
before or in [2, Section 8] the general case can be derived from the special
case where $P$ is a bounded closed convex set in $E^k$ and $x$ moves in one
coordinate neighborhood in $F$, so that $(P, x)$ is represented in the form


(57) for uve P. 


As auxiliary space we introduce the #"** with Cartesian coordinates 
Uy and metrize (a portion of) it by the Finsler metric 
5* determined by the integrand 


Then 


(59) (2, = € U) 




is a birectifiable surface in the resulting Finsler space because by (57) 


and (58) 
e| U, — | S e(u2)) S + |. 
The matrix uw) /du is 


Therefore the expressions Af(u) and of(u) corresponding to A(w) and 
o(u) for the surface (59) have the following properties 


(61) A(u) S Af(u) S (A?(u) + &B)? 


where B is a constant which depends only on n, k& and B (because 
| SB). For the expression Af(w) tends therefore to A(w). 
The direction coefficients for the tangent space of (59) at a point wu are 
the k-row determinants formed from (60) and divided by A‘(w). Hence, 
if A(w) 0, the tangent space tends to a k-dimensional space in (&,° - -, &n, 
+,0) and of(u) >o(u). Therefore 


(62) of (u)A&(u) >o(u)A(u) when 0. 
By (58) and (59) 
(63) Ui )e. (a (U2), U2)e] & 8* [a (us), (7 (Us), Ue)o] 
— 5*[ (x (w:), 0), (e(us), )]. 
If the set traversed by (x; u)¢, as u traverses P, is denoted by (z(P),P). 
then by (41) and [2, (8.1) ] | 
(64) LAG | = |(2(P), 8 = f of (u) Af(w) du 
and by (41) 
(65) | P, (2, f o° (uw) A°(u)du = o(u)A(u)du =| |x”. 
Kolmogoroff’s Principle (see [2, p. 248]), (67) and [2, (4.10)] show 
that 
(66) | P, (2, =| P, (4, u)o| =| P, x7. 


Theorem 7 is now a consequence of (41), (64), (65) and (66). 


UNIVERSITY OF SOUTHERN CALIFORNIA. 


REFERENCES. 


[1] T. Bonnesen and W. Fenchel, "Theorie der konvexen Körper," Ergebnisse der Mathematik, vol. 3, part 1, Berlin, 1934. 

[2] H. Busemann, "Intrinsic area," Annals of Mathematics, vol. 48 (1947), pp. 234-267. 

[3] H. Busemann, "The isoperimetric problem in the Minkowski plane," American Journal of Mathematics, vol. 69 (1947), pp. 863-871. 

[4] H. Busemann, "On the problem of Dido," Studies and Essays Presented to R. Courant on his 60th Birthday, New York, 1948, pp. 63-73. 

[5] A. Dinghas and E. Schmidt, "Einfacher Beweis der isoperimetrischen Eigenschaft der Kugel im n-dimensionalen euklidischen Raum," Abhandlungen der Preussischen Akademie der Wissenschaften, 1943, Math. Nat. Klasse no. 7 (1944). 

[6] G. H. Hardy, J. E. Littlewood, G. Pólya, Inequalities, Cambridge, 1934. 

[7] A. Kolmogoroff, "Beiträge zur Masstheorie," Mathematische Annalen, vol. 107 (1932), pp. 351-366. 

[8] L. Lusternik, "Die Brunn-Minkowskische Ungleichung für beliebige messbare Mengen," Comptes Rendus (Doklady) de l'Académie des Sciences de l'URSS (1935), vol. III (Nouvelle Série), pp. 55-58. 

[9] H. Minkowski, "Theorie der konvexen Körper, insbesondere Begründung ihres Oberflächenbegriffs," Gesammelte Abhandlungen, vol. II, Leipzig (1911), pp. 131-229. 

[10] G. Nöbeling, "Über den Flächeninhalt dehnungsbeschränkter Flächen," Mathematische Zeitschrift, vol. 48 (1943), pp. 747-771. 

[11] W. H. and G. C. Young, The Theory of Point Sets, Cambridge, 1906. 


APPROXIMATION IN, AND REPRESENTATION OF, CERTAIN 
BANACH ALGEBRAS.* 


By RICHARD ARENS. 


1. Introduction. The primary purpose of this paper is to extend to 
non-commutative Banach algebras the functional representation theorems 
obtained for certain commutative Banach algebras by M. H. Stone, I. Gelfand, 
I. Gelfand and M. Neumark, and by I. Kaplansky and the writer (See the 
bibliography at the end of this paper. We refer to papers listed there by 
placing the author’s name in brackets). 

The algebras we study, and call BQ*-algebras, are defined in section 5. 
The possibility of treating them by “commutative” methods is due to the 
fact that we have required each f to commute with its f* and ff* to lie in 
the center (cf. 5.02). Our main representation for an arbitrary BQ*-algebra 
A is essentially this: There exists a compact Hausdorff space X on which the 
group $\Gamma$ of automorphisms of the quaternions (the orthogonal group) is 
represented as a transformation-group, and A is isomorphic to the ring of 
all continuous quaternion-valued functions on X satisfying the conditions 
of covariance 


1.1. $f(\gamma(x)) = \gamma(f(x))$ (all $x$ in $X$, all $\gamma$ in $\Gamma$). 
(For details see 5.4 and 5.5.) It was shown in [Arens and Kaplansky, 9] 
that a representation without the covariance feature 1.1, for example, as 
functions on the structure space (of maximal two-sided ideals) is generally 
impossible even in the commutative case. 

The present methods evolve from the fundamental strategy of Stone, the 
most important changes being due to the indispensability of the covariance 
condition. From [Arens II] we predict that all irreducible homomorphisms 
are quaternion-valued; a commutative-sub-algebra argument based on [Arens 
and Kaplansky] followed by the extension of maximal ideals shows that the 
representation by functions on the space of irreducible homomorphisms is 
norm-preserving (and, a fortiori, faithful) and that BQ*-algebras are 
semi-simple. 

The greatest tactical problem was that of showing that we obtain all 


* Received May 11, 1948. 


functions satisfying the condition of covariance 1.1. The first half of the 
paper is devoted to establishing a suitable generalization, including Stone’s 
generalization, of the Weierstrass approximation theorem. Our investigation 
goes beyond that needed for quaternions, yielding such results as the following 
(cf. 4.2): 
Let X be a topological space. Let $M_n$ be the algebra of all matrices of 
degree n with complex entries. Let A be a real linear subalgebra of the 
algebra of all continuous $M_n$-valued functions on X. Suppose A contains 
f* (the hermitian conjugate) and the real part of the trace of f whenever 
it contains f. Then a necessary and sufficient condition that we 
should be able to approximate a function g uniformly on any compact 
set by elements from A is that we should be able to do so at any pair of 
points. 


The paper closes with an unrelated item: it is shown that in a Banach 
*-algebra in which $\|ff^*\| = \|f\|^2$ one cannot have $ff^* = -1$. This is 
related to, and derives its interest from, an unsolved problem of Gelfand 
and Neumark (cf. 7). 


2. Approximation theorems for lattice-valued functions. The principal 
theorems in this and the following sections are of the following character: 
there are given two sets of functions A and B with a common domain X 
and a common range G; moreover, 


2.01. for each x and y in X and any f in B there is a g in A such that 


g(x) =f (x) and g(y) = f(y). 


To this we add algebraic and topologic conditions on X, G, and A which 
insure that A includes B. 

We abbreviate condition 2.01 by saying that A is bi-approximate to B. 

In this section G will be an abelian topological group which is also 
partially ordered and has defined in it the two lattice-operations satisfying 
the usual laws. The algebraic structure is related to the order by the 
familiar condition that $a \le b$ implies $c - b \le c - a$ for all a, b, c. Finally 
it will be supposed that there is a family of symmetric open intervals, 
containing the group zero element 0, which forms a basis for the topology of 
G at 0. Consequently, the lattice operations are continuous in both 
arguments at once. A group G satisfying all these conditions will be called a 
topological lattice-ordered abelian group (cf. [Birkhoff]). 




Now a word on the topologization of function spaces. Let G be as 
above, let X be a topological space, and let C(X, G) be the family of continuous 
functions on X with values in G. All the operations of G and the ordering 
of G can be adopted, in a natural way, in C(X, G). But the topology we shall 
introduce into C(X, G) will not be in terms of the ordering and intervals 
in C(X, G) (except when X is compact), but will be defined as follows: 
for each compact subset K of X and each symmetric neighborhood $-e < g < e$ 
of 0 in G we let (K/e) denote the class of pairs (f, g) of functions in C(X, G) 
for which $-e < f(x) - g(x) < e$ whenever $x \in K$. This gives a uniform 
structure and, a fortiori, a topology (called the k-topology, see [Arens I]) to 
C(X, G). It is known that C(X, G) is complete in this uniform structure, 
if G is complete [Arens I]. 

The following theorem is an exploitation of a lattice-technical device 
used by Kakutani [Kakutani, pp. 1004-5] for a formally similar purpose. 
It was pointed out to us by Professor Stone (at Harvard, 1942) that this 
device could be used to prove Stone’s theorem (see 2.3 below) in more or less 
the way presented here now. 


2.1. THEOREM. Let G be a 


2.11 topological lattice-ordered abelian group. 


Let X be a topological space, and let C(X, G) have the k-topology. Let A 
be a topologically closed subgroup and 


2.12 sublattice of C(X, G). 


Then a necessary and sufficient condition that A include a given subset 
B of C(X, G) is that A be bi-approximate (2.01) to B. 


Proof. If A includes B, then obviously, 2.01 holds. We turn therefore 
at once to the sufficiency. 

Let $f \in B$, let the compact subset K of X, and let the neighborhood 
$(-e, e)$ of 0 in G, be given. By 2.01, for each $x, y \in K$ we can find 
$g_{xy} \in A$ such that $g_{xy}(x) = f(x)$, $g_{xy}(y) = f(y)$. There exists a neighborhood 
V(x) of x such that $g_{xy}(z) \ge f(z) - e$ for $z \in V(x)$. There exist $x_1, \dots, x_m$ 
such that K is covered by $V(x_1), \dots, V(x_m)$. With the corresponding 
functions, form the least upper bound $g_y = g_{x_1 y} \vee \dots \vee g_{x_m y}$. From the 
properties of this lattice operation we have $g_y(y) = f(y)$, and $g_y(z) \ge f(z) - e$ 
for all $z \in K$. Now we can find a neighborhood W(y) such that 
$g_y(z) \le f(z) + e$ for $z \in W(y)$. Dualizing the previous compactness 
argument, we select $y_1, \dots, y_n$ such that K is covered by $W(y_1), \dots, W(y_n)$ 
and then form the greatest lower bound $g = g_{y_1} \wedge \dots \wedge g_{y_n}$ of the 
corresponding functions. 

It is not hard to see that $-e \le g(z) - f(z) \le e$ for all $z \in K$, or 
$g - f \in (K/e)$. But this means that f lies in the closure of A. Since A is 
closed, $f \in A$. Hence B is contained in A, as was to be shown. 
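
The sup-inf construction in this proof can be tried out numerically. The following is only a minimal sketch under simplifying assumptions that are not in the paper: X is a finite set of reals (so every point is its own neighborhood and e may be taken as 0), G is the real line, and the two-point matches g_xy are affine interpolants. The names `interpolant` and `sup_inf` are hypothetical helpers introduced for illustration.

```python
def interpolant(f, x, y):
    """Return an affine function g with g(x) = f(x) and g(y) = f(y)."""
    fx, fy = f(x), f(y)
    if x == y:
        return lambda z: fx
    slope = (fy - fx) / (y - x)
    return lambda z: fx + slope * (z - x)


def sup_inf(f, points):
    """Form g = min over y of (max over x of g_xy) on a finite point set."""
    table = {(x, y): interpolant(f, x, y) for x in points for y in points}

    def g(z):
        # inner max plays the role of g_y, outer min the role of g
        return min(max(table[(x, y)](z) for x in points) for y in points)

    return g


points = [k / 10 for k in range(11)]
f = lambda t: abs(t - 0.35) ** 0.5
g = sup_inf(f, points)
max_error = max(abs(g(z) - f(z)) for z in points)
print(max_error)
```

On a finite set the inner maximum already dominates f at every point and agrees with f at y, so the outer minimum reproduces f up to rounding; in the theorem the same two steps give agreement only to within e on the compact set K.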

In the following, R will always denote the real number system, and K 
the complex system. 


2.2. THEOREM. Let X be a topological space. Let A be a closed linear 
subalgebra of C(X,R) with the k-topology. Then a necessary and sufficient 
condition that A include a given subset B of C(X,R) is that A be 
bi-approximate to B. 


Proof. We need only show 2.11 and 2.12. For 2.11, we define, as 
usual, $a \vee b$ to be the larger of the two numbers a and b, etc. To show that 
the lattice operations are limits of polynomials in A, and thus are performable 
in A, we use a device suggested by Lebesgue's proof of Weierstrass' theorem 
[Lebesgue]. 

If $(1 - (1 - t^2))^{1/2}$ be developed in powers of $1 - t^2$, it converges 
uniformly to $|t|$ for $|t| \le 1$. Now let M be a compact subset of X, let e 
be a positive real number, and let $f \in A$ be given. It suffices to consider 
merely the case in which $|f(x)| < 1$ for $x \in M$. Select a partial sum $S_n(t)$ 
of the series mentioned above such that $|S_n(t) - |t|| < e$ for all t of absolute 
value not exceeding 1. Then $S_n(f) - S_n(0)$ approximates $|f|$ uniformly 
to within 2e on M. Now $S_n(f) - S_n(0)$ belongs to A because when expanded 
there is no constant term. Since A is supposed to be closed in the k-topology, 
the function $|f|$, with values $|f(x)|$, thus approached on M, belongs to A. 
From here it is only a step to the maximum- and minimum-operations: We set 


$f \vee g = 1/2(f + g + |f - g|)$, $f \wedge g = 1/2(f + g - |f - g|)$. 


This proves 2.2. 
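
The series device can be checked numerically. The sketch below is an illustration only, not part of the paper: it sums the binomial expansion of (1 - u)^(1/2) with u = 1 - t^2 and uses it to approximate the maximum of two numbers. The function names and the choice of 400 terms are assumptions made for the example; convergence near t = 0 is slow, with an error of roughly 1/sqrt(pi n) after n terms.

```python
def abs_partial_sum(t, n):
    """n-th partial sum of (1 - u)**0.5 in powers of u = 1 - t*t."""
    u = 1.0 - t * t
    total = 1.0
    coeff = 0.5      # coefficient of u**1 in 1 - (1 - u)**0.5
    power = u
    for k in range(1, n + 1):
        total -= coeff * power
        coeff *= (2 * k - 1) / (2 * k + 2)   # ratio of successive coefficients
        power *= u
    return total


def approx_max(a, b, n):
    """Approximate max(a, b) via 1/2 (a + b + |a - b|), for |a - b| <= 1."""
    s = abs_partial_sum(a - b, n) - abs_partial_sum(0.0, n)  # no constant term
    return 0.5 * (a + b + s)


grid = [k / 50 - 1 for k in range(101)]
worst = max(abs(abs_partial_sum(t, 400) - abs(t)) for t in grid)
print(worst, approx_max(0.3, -0.4, 400))
```

Subtracting the value at t = 0 removes the constant term, which is what allows the approximating polynomial to be formed inside an algebra that need not contain the constants.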

2.3. THEOREM [Stone, pp. 466-469]. Let X be a compact Hausdorff 
space. A necessary and sufficient condition that a linear subalgebra A of 
C(X,R) coincide with C(X,R) itself is that (a) A contains the constant 
functions, (b) given $x \ne y$ in X there is an element f in A for which 
$f(x) \ne f(y)$, and (c) A is complete. 


Proof. The necessity is well known; it follows from Urysohn's Lemma. 
For the sufficiency, we must show that (a) and (b) imply that A is 
bi-approximate to C(X,R). Let $g \in C(X,R)$, and let $x, y \in X$. There is an 
$f \in A$ such that $f(x) \ne f(y)$. By multiplication and addition of suitable 
constants, we can obtain from f an $h \in A$ such that h(x) = g(x), h(y) = g(y), 
as desired. Now (c) shows that A is closed in C(X,R). Hence we apply 
2.2 and obtain 2.3. 

A consequence which is closely related to Weierstrass' theorem itself, 
and which will be used often in the sequel, will next be presented. 


2.4. LEMMA. Let X be a topological space. Let $f_1, \dots, f_n$ be members 
of C(X,R). Let F be a continuous function of n real variables, and let 
$F(0, \dots, 0) = 0$. Then the function $g = F(f_1, \dots, f_n)$ is in the closed 
linear subalgebra A generated by $f_1, \dots, f_n$ in C(X,R) under the k-topology. 
Moreover, A consists of precisely all such g's. 


Proof. We show that A is bi-approximate to g. Let $x \ne y$ be given. 
If $f_i(x) = f_i(y)$ for all i then g(x) = g(y), and if $f_1(x), \dots, f_n(x)$ all 
vanish then g(x) = 0 and so $g(x) = f_1(x)$, $g(y) = f_1(y)$; but if some don't 
vanish, let $f_i(x) \ne 0$: then t can be found such that $g(x) = t f_i(x) = g(y) 
= t f_i(y)$. On the other hand, if $f_i(x) \ne f_i(y)$ for some i, then the 
determinant of the system 


$g(x) = s f_i(x) + t f_i^2(x)$ 
$g(y) = s f_i(y) + t f_i^2(y)$ 


does not vanish unless $f_i(x)$ or $f_i(y)$ does. If one of them vanishes, let it be 
the former, $f_i(x) = 0$. If, in addition, g(x) = 0, we can find a t such that 
$g(x) = 0 = t f_i(x)$, $g(y) = t f_i(y)$, since $f_i(y) \ne 0$; but if $g(x) \ne 0$ then 
there must be some non-zero $f_j(x)$, in which case we can solve 


$g(x) = s f_j(x) + t f_i(x)$ 
$g(y) = s f_j(y) + t f_i(y)$, 
since $f_i(y) \ne 0$. 
The converse follows essentially from the completeness of C(X,R). 
(Cf. [Arens I]). 
In the case of complex-valued functions, we deduce the following results. 
2.5. THEOREM. Let X be a topological space. Let A be a closed real 
(i.e., admitting real scalars) linear subalgebra of C(X,K), with the 
k-topology, and let A contain f* (the complex-conjugate) whenever it contains f. 
Then a necessary and sufficient condition that A include a given subset B of 
C(X,K) is that A be bi-approximate to B. 


Proof. Again we show how 2.11 and 2.12 can be fulfilled. We partially 
order K by saying that $r + pi \ge 0$ (r, p real) if $r \ge |p|$. If for complex 
numbers r + pi we define the complex number <r + pi> = m + rpi/m where 
m is max(|r|, |p|); and then define, for complex $q_1$, $q_2$, 


$q_1 \vee q_2 = 1/2(q_1 + q_2 + \langle q_1 - q_2 \rangle)$, $q_1 \wedge q_2 = 1/2(q_1 + q_2 - \langle q_1 - q_2 \rangle)$, 


we obtain two operations which turn out to be the lattice operations consistent 
with this ordering. We leave the verification to the reader (Cf. 3). 

We must now show, after these lattice operations are introduced into 
C(X, K), that A is closed under them. We shall show that if $f \in A$ and 
f(x) = r(x) + p(x)i then 

<f> = m + rpi/m 


where m(x) = max(|r(x)|, |p(x)|), belongs to A. First of all, r and ip 
separately belong to A, for r = 1/2(f + f*), ip = 1/2(f - f*). Next, 
$p^2 = (ip)(ip)^*$ belongs to A. By 2.4, A contains $F(r, p^2)$ where 
$F(s, t) = \max(|s|, |t|^{1/2})$. Hence the function m belongs to A. For a real positive 
u, the function $r(u + m)^{-1}$ belongs to A, by 2.4. Hence $rpi(u + m)^{-1} \in A$. 


Now 
$|rpi/(u + m) - rpi/m| \le u$. 


Considering that A is closed, we have $rpi/m \in A$. Thus $\langle f \rangle \in A$, and A is a 
sublattice. This proves 2.5. 
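
The ordering of K and the bracket operation used in this proof are easy to experiment with. The sketch below is an illustration only; the function names and the floating-point tolerance are assumptions made for the example. It checks on a few sample values that the join dominates both arguments and the meet is dominated by both, in the order where r + pi is non-negative when r is at least |p|.

```python
def is_nonneg(q, tol=1e-12):
    """q = r + pi counts as >= 0 when r >= |p| (up to rounding)."""
    return q.real >= abs(q.imag) - tol


def bracket(q):
    """<r + pi> = m + (r p / m) i with m = max(|r|, |p|), and <0> = 0."""
    m = max(abs(q.real), abs(q.imag))
    if m == 0:
        return 0j
    return complex(m, q.real * q.imag / m)


def join(a, b):
    return 0.5 * (a + b + bracket(a - b))


def meet(a, b):
    return 0.5 * (a + b - bracket(a - b))


samples = [1 + 2j, -0.5 + 0.25j, 3 - 1j, -2 - 2j, 0j]
checks = all(
    is_nonneg(join(a, b) - a) and is_nonneg(join(a, b) - b)
    and is_nonneg(a - meet(a, b)) and is_nonneg(b - meet(a, b))
    for a in samples for b in samples
)
print(checks)
```

The check covers only the incidence property; that the join is the least upper bound is the part of the verification the text leaves to the reader.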

From this result, a new flock of corollaries could be deduced; of these 
we mention one, because of its relation to [Dunford and Segal, Theorem 4]. 


2.6. COROLLARY. Let X be a locally compact space. Let A be a closed 
real linear subalgebra of $C_0(X, K)$, the class of continuous functions vanishing 
at infinity (see [Dunford and Segal]), and suppose A contains f* whenever 
it contains f. Then $A = C_0(X, K)$ if and only if for each x, y, $x \ne y$ 
implies that there is an f in A such that f(x) is real and f(y) is not real. 


The proof is essentially the same as that of 2.3, but about four times 
as long because (a) the hypothesis is applied once as it stands and then 
again with x and y interchanged, and (b) only real scalars are to be used, 
although values may be complex. 

Corollary 2.6, from which Theorem 4 [Dunford and Segal] may readily 
be deduced, differs from it inasmuch as Dunford and Segal appear to require 
complex scalar multiplication in A, while in 2.6 this condition is replaced 
by requiring essentially that we can obtain all complex numbers at each 
point y of X. In this form, the results, although not the methods, can be 
carried over to the case of quaternion valued functions, which are treated in 
the next two sections. 

Our final application is the proof of a theorem which the authors of 
[Arens and Kaplansky] neglected to prove in the paper just cited. It can 
be proved by purely ideal-theoretic arguments (as indeed the known special 
cases are proved), but as we need the result in a later section, we take this 
opportunity to illustrate the method of bi-approximation in such a connection. 


2.7. THEOREM. Let X be a locally compact Hausdorff space on which 
is defined an involutory homeomorphism (*). Let A be that subring of 
C(X, K) whose elements satisfy the relation f(x*) = f(x)*, and vanish at 
infinity. Then each homomorphism H of A into K is determined by precisely 
one point x of X and the formula H(f) = f(x), (f in A). 


Proof. Let the maximal ideal $\mathfrak{a}$ be the set of functions in A which are 
mapped into 0 by H. By introducing a unit into A (if it has none) and 
applying [Gelfand, Satz 4], we discover that $\mathfrak{a}$ is a closed set of A and thus 
a closed linear subalgebra. If we consider the point at infinity $\omega$ to be 
adjoined to X, then C(X,K) can be regarded as having the k-topology of 
$C(X + \omega, K)$. Suppose that for each x in X there is a g in $\mathfrak{a}$ such that 
$g(x) \ne 0$. If x and y are points of $X + \omega$ (and we may exclude the case 
that x or y is $\omega$), there will be two elements $g_x$ and $g_y$ such that $g_x(x) \ne 0 
\ne g_y(y)$. By letting $h_x = g_x g_x^*$, $h_y = g_y g_y^*$, we obtain $h_x(x) = a' > 0$, 
$h_y(y) = b' > 0$. By suitable scalar multiplications, we can make a' = b' = 1. 
Let $h_x(y) = a$, $h_y(x) = b$. Unless ab = 1, we can find c and d such that 
$h = c h_x + d h_y$ has the value 1 at x and y. If ab = 1 and $a \ne 1$, 
we can find c and d such that $h = c h_x + d h_y^2$ has the value 1 at x and y. 
If a = 1, let $h = h_x$. In any event, we have an h in $\mathfrak{a}$ such that h(x) = 1 
= h(y). Now let f be any element of A. Then hf belongs to $\mathfrak{a}$, and since 
(hf)(x) = f(x), (hf)(y) = f(y), we see that $\mathfrak{a}$ is bi-approximate to A. 
Thus $\mathfrak{a}$ coincides with A. This contradiction shows that there is an x such 
that $f \in \mathfrak{a}$ implies f(x) = 0. The ideal $\mathfrak{a}$ is clearly not maximal unless there 
are only two (perhaps coincident) such points, x and x*, related, as indicated, 
by the homeomorphism (*). Since $\mathfrak{a}$ is the kernel of only two possible 
homomorphisms, which are given by x and x*, one of these must determine H 
by the formula stated. 


Thus the space X is the space of homomorphisms of A into K. 



3. Preliminary remarks on quaternions and partially ordered groups. 
The methods of the preceding section can be extended to quaternion-valued 
functions by setting $q \ge 0$ for a quaternion q if its real part is non-negative 
and its pure part lies in the first quadrant for some fixed choice of axes in 
the system of quaternions, Q. This makes Q a lattice, but the performance 
of lattice operations in C(X,Q) would require not only the use of f and f* but 
also the non-invariant $f^i$, $f^j$, and $f^k$, where by $f^i$ for example, we mean the 
function with values $-i f(x) i$. 

The ordering we have found useful reduces, in the special case of Q, 
to saying $r + ai + bj + ck \ge 0$ if and only if $r \ge (a^2 + b^2 + c^2)^{1/2}$. 
This ordering is suggested by the "cone of the future" of special relativity, 
if the real part r is regarded as the "time." So ordered, Q does not form 
a lattice (although K does: cf. proof of 2.5) because the common part of 
two "cones of the future" is generally not itself such a cone (cf. [Clarkson]). 
Geometrical considerations make this evident. 

But a weaker sort of lattice operation is possible, and the proofs of the 
next section will utilize it. Its exposition is simplified by the introduction 
of a singulary operation. We list here some formulae and results which show 
their abstract relationship, and which show how far structures in which they 
can be defined fall short of lattices. For concrete definitions, see the next 
section. 

Let G be a partially ordered abelian group admitting the positive (dyadic) 
rational numbers as operators, with the requirements: $a, a' \le b$ implies 
$c - b \le c - a$ and $1/2(a + a') \le b$ for any a, a', b, c in G. Suppose there 
are defined a singulary operation < > and a binary operation $\vee$: then the 
following 3.01-3.14 are meaningful. 

3.01. G is a topological group and $\vee$ is continuous. 
3.02. $a \vee a = a$ (idempotence). 
3.03. $a \vee b \ge a, b$ (incidence). 
3.04. $a, b \le c$ implies $a \vee b \le c$. 
With 3.03 and 3.04, G would be a lattice. 
3.05. $2(a \vee b) = 2a \vee 2b$. 
3.06. $(a + b) \vee (a - b) = a + b \vee (-b)$. 
3.07. $\langle a \rangle = a \vee (-a)$. 
3.08. $a \vee b = 1/2(a + b + \langle a - b \rangle)$. 
3.10. G is a topological group and < > is continuous. 
3.11. $\langle 0 \rangle = 0$. 
3.12. $\langle a \rangle \ge \pm a$. 
3.13. $a, b \ge 0$ implies $a + b \ge \langle a - b \rangle$. 
3.14. $\langle 2a \rangle = 2\langle a \rangle$. 

We now list a number of propositions. We shall require only the last 
one for later use. The others are possibly of independent interest. The proofs 
are very short and are all left to the reader. 


3.15. LEMMA. In the presence of 3.07 and 3.08, (3.01, 3.02, 3.03) 
is equivalent to (3.10, 3.11, 3.12). 

3.16. LEMMA. 3.08 with 3.14 implies 3.07. 

3.17. LEMMA. 3.05 with 3.06 and 3.07 implies 3.08. 

3.18. LEMMA. 3.08 with 3.13 implies 3.04. 

3.19. LEMMA. 3.04 with 3.07 implies 3.13. 

3.20. LEMMA. In the presence of 3.07 and 3.08, G is a lattice if and 
only if 3.11, 3.12, and 3.13 hold. 

3.21. LEMMA. 3.03 with 3.04 implies 3.02, 3.05 and 3.06 (case of 
a lattice). 

3.22. THEOREM. 3.08 with 3.10, 3.11, 3.12, and 3.14 implies 3.01, 
3.02, and 3.03. 


4. More general approximation theorems. As ranges for the function 
spaces later to be considered, we introduce the following spaces. The 
quaternions form a special case. 

Let E be a Banach space with norm $\|\cdot\|$. Form the direct product 
$G = R \times E$ (Cf. [Banach, p. 181]), which is also a real Banach space. If 
r is a real number, and we speak of $r \in G$, then we mean the element (r, 0). 
Whenever $g = (r, p) \in G$ where $r \in R$, $p \in E$, we will denote these components 
as follows: r = g', p = g''. We partially order G by writing 


4.01. $g \ge 0$ when $g' \ge \|g''\|$; $g \ge g_1$ if and only if $g - g_1 \ge 0$. 


4.02. LEMMA. G is a partially ordered abelian topological group (in 
the sense of 3). 


Proof. The only non-trivial part of the theorem is quickly reduced to 
the question: if $(r, p), (r_1, p_1) \ge 0$, is $(r + r_1, p + p_1) \ge 0$? We have at 
once $r \ge \|p\|$, $r_1 \ge \|p_1\|$. From this $r + r_1 \ge \|p + p_1\|$ follows, by the 
triangle inequality for the norm in E. 

The singulary operation < > in G shall be defined as follows: 

4.03. $\langle g \rangle = (\max(|g'|, \|g''\|),\ g'g''/\max(|g'|, \|g''\|))$, $g \ne 0$; 

$\langle 0 \rangle = 0$. 

4.04. LEMMA. Conditions 3.10, 3.11, 3.12, and 3.14 hold. 


Proof. The continuity (3.10) of < > is doubtful at most at the origin; 


but it is clearly present at the origin also, since 


|. 


To prove 3.12, first set g' = r, g'' = p. Abbreviate $\max(|r|, \|p\|)$ 
by m. Then 
$|r + m| \le m + r$, $|r - m| \le m - r$; 
whence $\|rp/m + p\| \le m + r$ and $\|rp/m - p\| \le m - r$, 
or 
$(m \pm r, rp/m \pm p) \ge 0$ in G. 


This proves $\langle g \rangle = (m, rp/m) \ge \pm g$. The remaining condition 3.14 
obviously holds. Thus 4.04 is proved. 


4.05. LEMMA. If one defines $a \vee b = 1/2(a + b + \langle a - b \rangle)$ 
in G, then this operation is continuous, idempotent, and incidental. 


This results from 3.22 and 4. 04. 
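
The operations of 4.01-4.05 can be tried out concretely. The sketch below is an illustration only, with assumptions not taken from the paper: E is taken to be Euclidean 3-space (the quaternion case), elements of G are stored as pairs (r, p), and all function names are hypothetical. It spot-checks 3.12, 3.14 and the incidence property of the join on sample values.

```python
import math


def norm(p):
    return math.sqrt(sum(c * c for c in p))


def is_nonneg(g, tol=1e-12):
    """g = (r, p) counts as >= 0 when r >= ||p|| (up to rounding)."""
    r, p = g
    return r >= norm(p) - tol


def add(g, h):
    return (g[0] + h[0], tuple(a + b for a, b in zip(g[1], h[1])))


def scale(t, g):
    return (t * g[0], tuple(t * c for c in g[1]))


def bracket(g):
    """<g> = (m, r p / m) with m = max(|r|, ||p||), and <0> = 0."""
    r, p = g
    m = max(abs(r), norm(p))
    if m == 0:
        return (0.0, tuple(0.0 for _ in p))
    return (m, tuple(r * c / m for c in p))


def join(g, h):
    """g v h = 1/2 (g + h + <g - h>)."""
    diff = add(g, scale(-1.0, h))
    return scale(0.5, add(add(g, h), bracket(diff)))


g = (0.5, (1.0, -2.0, 0.5))
h = (-1.0, (0.0, 0.25, 3.0))
upper = join(g, h)
print(upper)
```

Nothing here tests the least-upper-bound property 3.04, which the text points out fails for this ordering; only the weaker properties used later are exercised.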

After these preparations, we present our most inclusive approximation 
theorem. 

4.1. THEOREM. Let $G = R \times E$ where E is a Banach space with norm 
$\|\cdot\|$. Let X be a topological space. Suppose A is a closed linear subspace 
of C(X, G) with the k-topology such that 


4.11. If f, g belong to A then f'g belongs to A; 


4.12. If f belongs to A then $(\|f''\|^2, 0)$ belongs to A (this function has 
the values $\|f(x)''\|^2 \in R$); similarly (f', 0) belongs to A; 


4.13. The elements having f'' = 0 form a real linear algebra (where 
(f', 0)(g', 0) = (f'g', 0)); 


then a necessary and sufficient condition that A 
include a given subset B of C(X, G) is that A be bi-approximate (2.01) to B. 


Proof. The condition is obviously necessary. To prove the sufficiency, 
we will show that for any real positive e, for any compact subset M of X, 
and for any f in B, one can find a g in A such that for any $z \in M$ one has 
$g(z) \ge f(z) - e$ and $g(z)' < f(z)' + 2e$. This yields $\|g(z)'' - f(z)''\| 
\le g(z)' - f(z)' + e \le 3e$, which shows that g approximates f uniformly on M. 

In the construction of g we shall use the $\vee$ operation; hence we must now 
show that A is closed under this operation. Again, it is enough to show that 
for $h \in A$ one has $\langle h \rangle \in A$. If $h \in A$ then (h', 0) and $(\|h''\|^2, 0) \in A$. Now 
$(\max(|h'|, \|h''\|), 0)$ is a continuous function of these which vanishes for 
zero values of the arguments; hence, by 2.4 and 4.13, this element belongs 
to A. As for the second component, we observe that for any real positive u, 
$(h'(u + \max(|h'|, \|h''\|))^{-1}, 0)$ belongs to A, by 2.4 and 4.13 as before. 
By applying 4.11, and then subtracting off the first component, we arrive at 
$(0, h'(u + \max(|h'|, \|h''\|))^{-1}h'')$. By the same argument used in the proof 
of 2.5, the limit of this as u approaches zero lies in A. Hence A contains 


$\langle h \rangle = (\max(|h'|, \|h''\|),\ h'h''/\max(|h'|, \|h''\|))$. 


Now we proceed just as in the proof of 2.1, until we have found 
$g_{x_1 y}, \dots, g_{x_m y}$ which, we recall, have the property that $g_{x_k y}(y) = f(y)$ and 
$g_{x_k y}(z) \ge f(z) - e$ for each z in M and proper choice of k. We define 
$g_y = (\dots((g_{x_1 y} \vee g_{x_2 y}) \vee g_{x_3 y}) \dots) \vee g_{x_m y}$. Which way we associate here 
does not matter, but a specific order must be selected since the associative law 
does not hold. By 3.01 and 3.02 we have $g_y(y) = f(y)$ and $g_y(z) \ge f(z) - e$ 
for each z in M. For each y in M we can find a neighborhood W(y) such that 
$g_y(z)' \le f(z)' + e$ for z in W(y). By the compactness of M there are $y_1, \dots, y_n$ such that 
M is covered by $W(y_1), \dots, W(y_n)$. Let us denote the corresponding 
$g_y$-functions by $g_1, \dots, g_n$. The desired g is going to be a linear combination 
of these $g_k$. This is where the present proof departs from the method of 
proving 2.1. 

To construct the coefficients of the linear combination, let us first 
abbreviate by setting $r = \min(g'_1, \dots, g'_n)$, and let 

$s = 1/(e/n + g'_1 - r) + \dots + 1/(e/n + g'_n - r)$. 

Then for each $k = 1, \dots, n$, the functions 


$c_k = 1/s \cdot 1/(e/n + g'_k - r) - 1/n$ 


are continuous functions of $g'_1, \dots, g'_n$ and vanish when these do. By 2.4 
and assumption 4.13, the elements $(c_1, 0), \dots, (c_n, 0)$ belong to A, whence 
A contains 

$g = c_1 g_1 + \dots + c_n g_n + (g_1 + \dots + g_n)/n$. 

It is apparent now that g lies in A, but we could 
not immediately have inferred it from the formula 


$g = [g_1/(e/n + g'_1 - r) + \dots + g_n/(e/n + g'_n - r)] \cdot 1/s$, 

obtained by collecting terms. For each z in M, the coefficients 

$1/s(z) \cdot 1/(e/n + g'_1(z) - r(z)), \dots, 1/s(z) \cdot 1/(e/n + g'_n(z) - r(z))$ 


are positive real numbers whose sum is 1. Therefore, since $g_1(z), \dots, g_n(z)$ 
lie in the convex set of G defined by $g \ge f(z) - e$, we conclude $g(z) \ge f(z) - e$. 
Continuing, let us set $g'_1(z) = r_1, \dots, g'_n(z) = r_n$, and let us suppose that 
$r_k = \min(r_1, \dots, r_n)$. Then $s(z) > (e/n + r_k - r_k)^{-1} = n/e$, whence 
$1/s(z) < e/n$. Now 

$g'(z) = 1/s(z) \cdot r_1/(e/n + r_1 - r_k) + \dots + 1/s(z) \cdot r_n/(e/n + r_n - r_k)$; 


and by the formula for s(z), 

$r_k = 1/s(z) \cdot r_k/(e/n + r_1 - r_k) + \dots + 1/s(z) \cdot r_k/(e/n + r_n - r_k)$. 

Therefore 

$g'(z) - r_k = 1/s(z) \cdot (r_1 - r_k)/(e/n + r_1 - r_k) + \dots + 1/s(z) \cdot (r_n - r_k)/(e/n + r_n - r_k)$, 


whence, each of the n fractions being less than 1, 

$g'(z) - r_k < (e/n) \cdot n = e$. 


Since $r_k = \min(g'_1(z), \dots, g'_n(z)) \le f'(z) + e$ we have $g'(z) < f'(z) + 2e$, 
as desired. This finishes the proof of 4.1. 
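
The weighting scheme of this proof is easy to check at a single point z. The sketch below is an illustration only; the function names and the sample numbers are assumptions made for the example. It computes the coefficients 1/s · 1/(e/n + r_k - r) from a list of first components r_1, ..., r_n and confirms that they are positive, sum to 1, and produce a weighted first component that exceeds the minimum by less than e.

```python
def weights(first_parts, e):
    """Coefficients 1/s * 1/(e/n + r_k - r), with r = min of the r_k."""
    n = len(first_parts)
    r = min(first_parts)
    inverse_terms = [1.0 / (e / n + rk - r) for rk in first_parts]
    s = sum(inverse_terms)
    return [term / s for term in inverse_terms]


def combined_first_part(first_parts, e):
    """First component of the convex combination g at the point z."""
    w = weights(first_parts, e)
    return sum(wk * rk for wk, rk in zip(w, first_parts))


sample = [0.2, 1.5, -0.3, 4.0]   # hypothetical values r_1, ..., r_n at one point z
e = 0.1
w = weights(sample, e)
excess = combined_first_part(sample, e) - min(sample)
print(w, excess)
```

The weights concentrate on the functions whose first component is closest to the minimum, which is how a linear combination stands in for the greatest lower bound that the ordering of G does not supply.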

We now apply this result to rings of matrix-valued functions. If f is 
such a function, then f* will be used to denote the function whose value f*(x) 
is the hermitian transpose f(x)* of f(x), and f* will be called the adjoint of f. 
By rtr f we mean the function having as its value one n-th of the real part 
of the trace of f. 


4.2. THEOREM. Let X be a topological space. Let $M_n$ be the algebra 
of all matrices of degree n with complex entries. Let A be a closed real linear 
subalgebra of $C(X, M_n)$ with the k-topology, and suppose 

4.21. A contains f* whenever it contains f. 

4.22. A contains rtr f whenever it contains f. 

Then a necessary and sufficient condition that A include a given subset B of 
$C(X, M_n)$ is that A be bi-approximate to B. 


Proof. Let E be the class of matrices in $M_n$ whose trace is zero or 
imaginary. The class $R_n$ of real scalar matrices is obviously isomorphic, with 
preservation of linear combinations, to the real number system R. 


Observing this isomorphism, we can say that $M_n$ is isomorphic to $R \times E$, with 
preservation of real linear combinations. The usual topology of $M_n$ is 
reproduced by taking the "norm" in E to be 

$\|p\| = (\frac{1}{n} \sum |p_{ij}|^2)^{1/2}$, $p = (p_{ij}) \in E$. 

For f in A, we have $f' \in A$ since f' is obviously rtr f. Since A is a subalgebra, 
f'g is in A if g is. This shows that 4.11 holds. If $f \in A$, then $f - f' = f'' \in A$; 
and $\|f''\|^2 = \mathrm{rtr}(f''f''^*) \in A$. Thus 4.12 is verified. Finally, 4.13 holds 
because A is a real linear subalgebra. An appeal to 4.1 now establishes 4.2. 

The earlier results on the real (2.2) and complex (2.5) cases are 
consequences of this theorem, but of course the real case is already involved in 
the proof through the medium of 2.4. The following statement for the case 
of the quaternions Q is now possible. 


4.3. THEOREM. Let X be a topological space. Let A be a closed real 
linear subalgebra of C(X,Q), with the k-topology, and let A contain f* (the 
quaternion-conjugate) whenever it contains f. Then a necessary and sufficient 
condition that A include a given subset B of C(X,Q) is that A be 
bi-approximate to B. 


Proof. As is well known, the quaternions can be represented by the 
ring of unitary matrices of the form 


$$\begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix} \qquad (a, b \text{ complex});$$ 


and in this representation, the conjugate quaternion is represented by the 
hermitian transpose. Moreover, rtr f = 1/2(f + f*) is evidently a member 
of A if f is. Therefore the truth of 4.3 is reduced to that of 4.2. 
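
The two facts used in this reduction can be confirmed on a sample matrix. The sketch below is an illustration only; the function names and the sample entries are assumptions made for the example. It builds a 2-by-2 matrix of the stated form from complex numbers a and b, and checks that half the sum of the matrix and its hermitian transpose is rtr times the identity, and that the product of the matrix with its hermitian transpose is a real scalar matrix.

```python
def quaternion_matrix(a, b):
    """Matrix [[a, -conj(b)], [b, conj(a)]] representing a quaternion."""
    return [[a, -b.conjugate()], [b, a.conjugate()]]


def adjoint(m):
    """Hermitian transpose of a 2-by-2 complex matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def rtr(m):
    """One half of the real part of the trace (n = 2)."""
    return (m[0][0] + m[1][1]).real / 2


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


m = quaternion_matrix(1 + 2j, 3 - 1j)
ma = adjoint(m)
half_sum = [[(m[i][j] + ma[i][j]) / 2 for j in range(2)] for i in range(2)]
product = matmul(m, ma)
print(half_sum, product)
```

For this sample the product is 15 times the identity, the squared modulus of the quaternion, which is the matrix form of the centrality of ff* required in 5.02.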

This theorem is used later in the study of certain representations ; but the 
following application has some interest of its own. 

4.4. THEOREM. Let X be a completely regular space. Let A be a closed 
real linear subalgebra of C(X,Q) with the k-topology, and let A contain 
f* whenever it contains f. Then a necessary and sufficient condition that 
A = C(X, Q) is that, given x, y in X and a quaternion q, there is an f in A 
for which f(x) = 0 and f(y) = q. 

Proof. The necessity is a consequence of the complete regularity of X. 
Therefore let us show that A is bi-approximate to C(X,Q). Let $g \in C(X, Q)$, 
and let $x, y \in X$. There are $f_x$ and $f_y$ in A such that $f_x(x) = 0$, $f_x(y) = g(y)$, 
$f_y(x) = g(x)$, $f_y(y) = 0$. Therefore $f = f_x + f_y$ shares its values at x and y 
with g, as desired. 


5. Representation of BQ*-algebras. By a Banach algebra we mean a 
normed ring in the sense of [Gelfand], except we do not require complex 
scalars nor a unit element. By a BQ*-algebra we mean a Banach algebra A 
in which there is defined an operation (*) which satisfies, for real r, and 
f, g in A 

5.01. $(rf + g)^* = rf^* + g^*$, $(fg)^* = g^*f^*$; 

5.02. ff* lies in the center of A, 

5.03. $\|f\|^2 \le \|ff^* + gg^*\|$, if f and g commute; and the norm itself 
satisfies 

5.04. 

If all conditions except 5.02 are satisfied, we call A a real 
Banach*-algebra. The letter "Q" is used to imply 5.02, which was suggested by the 
quaternions, and the word "real" is then omitted. 


We prove first a result on commutative Banach*-algebras. 


5.05. Lemma. Let B be a commutative real Banach*-algebra. Let
$f \in B$. Then

5.06. $|x(f)| \le \|f\|$ and $x(f^*) = \overline{x(f)}$ for any homomorphism x of B
into the complex numbers;




5.07. there is a homomorphism x of B into the complex numbers for
which $|x(f)| = \|f\|$;


5.08. $ff^* = g^4$ for some $g = g^* \in B$, and $ff^* - \|ff^*\|$ has no inverse
in B,


5.09. The closure of the principal ideal generated by ff* also contains f. 


Proof. This algebra B can be represented by an algebra A as described
in 2.7 above, in such manner that

$$f^*(x) = \overline{f(x)} \quad \text{and} \quad \|f\| = \sup_x |f(x)|$$

by [Arens and Kaplansky, Theorem 9.1]. Now 5.06, 5.07, and 5.08 are
obvious consequences of that representation.

We now prove 5.09, also supposing that B is represented by an A as
described in 2.7. We designate the ideal by J and ff* by j. Let $x, y \in X$
(please refer to 2.7). If f vanishes at x or y, then a scalar t can easily be
found such that tjf and f agree in value at x as well as y. Otherwise
0 < j(x), j(y), and there is an element

$$h = \frac{[\,j(x) + j(y)\,]\,j - j^2}{j(x)\,j(y)}$$

in J. Since h(x) = h(y) = 1 it is evident that hf and f agree in value at x
as well as y. This shows that J is bi-approximate to f, and by 2.5, J contains f.

Let Hom(A, Q) denote the class of homomorphisms of the ring A into Q.


5.1. Lemma. Let A be a BQ*-algebra. If $x \in$ Hom(A, Q) then, for
$f \in A$,

5.11. $|x(f)| \le \|f\|$ (|...| denotes the modulus in Q),
and

5.12. $x(f^*) = x(f)^*$.


Proof. First, we observe that in a BQ*-algebra, one has ff* = f*f. In
fact, $f(ff^* - f^*f) = f^2f^* - (ff^*)f = f^2f^* - f(ff^*) = 0$, from which one can
deduce that $(ff^* - f^*f)^2 = 0$. Replacing f in 5.03 by ff* − f*f, and g by 0
we see that ff* − f*f = 0. Second, we apply 5.06 to the commutative subalgebra
$A_f$ generated by f and f*, giving 5.11 and 5.12.

The next proposition is essentially that Hom(A, Q) has sufficiently
many members.

5.2. Lemma. Let A be a BQ*-algebra, and let $f \in A$. Then there is an
element x in Hom(A, Q) such that $|x(f)| = \|f\|$.




Proof. We may exclude the case f = 0. Case 1: A has a unit element 1.
Let $\|f\| = r$. If $r^2 - ff^*$ has no inverse in A, then, since it lies in the center,
it lies in some maximal two-sided ideal M of A. We assert that each g which
is not in M has an inverse mod M. In fact, if $g \notin M$ then $gg^* \notin M$ by 5.09.
Now the center of A is, mod the maximal ideal M, a field. Since gg* lies in
the center, there is an h such that gg*h − 1 is in M. This g*h is the desired
inverse mod M. Evidently A/M is a division algebra which can, by [Arens II],
be only Q, or a subfield of Q. Now, for the resulting homomorphism x of A
into Q, $x(r^2 - ff^*) = 0$, whence $x(ff^*) = r^2$.
Application of 5.11 and 5.12 gives $|x(f)| = \|f\|$. If $r^2 - ff^*$ has an inverse
in A, then it has an inverse in the closed commutative linear subalgebra $A_f$
generated by f, f* and 1. But by 5.08, $r^2 - ff^*$ cannot have an inverse.


Case 2: A has no unit. Imbed A isometrically in a ring $A_1$ which has a
unit, after the fashion of [Arens III, Lemma 4]. We do not care whether
5.03 holds in $A_1$: at least it holds for the subalgebra A. If $r^2 - ff^*$ has
no inverse in $A_1$, we arrive at the desired homomorphism x of $A_1$, as before.
But x is also a homomorphism of A. But if $r^2 - ff^*$ has an inverse s − h
in $A_1$, then $(s - h)(r^2 - ff^*) = 1$. Since A does not contain 1, $sr^2 = 1$,
and we arrive at an element $p = -r^2h$ in A for which $r^2p = ff^* + pff^*$. Applying
5.08 to the commutative closed linear subalgebra $A_f$ generated by f and f*
we obtain that $ff^* = g^4$, where $g = g^* \in A_f$. Therefore $g^2$ is in the center of A.
Consider the commutative closed linear subalgebra $A_g$ generated by p, p*, and
$g^2$. Evidently $\|g^4\| = r^2$ and so by the representation of [Arens and Kaplansky,
loc. cit.], there is at least one homomorphism y of $A_g$ for which $y(g^4) = r^2$.
Applying y to the equation $r^2p = g^4 + pg^4$ we obtain the contradiction $r^2 = 0$.


Thus the conclusion of 5. 2 is reached in all cases. 


If $x \in$ Hom(A, Q), and α is an automorphism of Q leaving R invariant,
then there is a $y \in$ Hom(A, Q) such that

5.3. $y(f) = \alpha(x(f))$ (for all $f \in A$).

In this case we write y = αx. In this way the group Γ of automorphisms of Q
which leave R fixed acts in a natural way as a group of one-to-one transformations
on Hom(A, Q). (The fact that α is always an inner automorphism and
that Γ is isomorphic with the orthogonal group is not relevant at this point,
but is implicitly taken into account in the proof of 5.4.) If we introduce
into X = Hom(A, Q) the weak neighborhood topology, then X becomes a Hausdorff
space, and each $f \in A$ may be regarded as a continuous function by setting
f(x) = x(f). Moreover, X becomes a compact space. (We are not omitting
the 0-homomorphism. But some may prefer to omit it, in which case X is
locally compact and 0 becomes the point at infinity.) The proof of these facts
can be constructed in the same way as the proof of the related case of the unit
sphere of the conjugate space of a Banach space [Alaoglu], or alternately,
along the lines of [Arens III, p. 278]. With this topology, the automorphisms
act homeomorphically on X.


5.4. THEOREM. Let A be a BQ*-algebra. Let X = Hom(A, Q) be
given the weak neighborhood topology. Then X is a compact Hausdorff space.
For any automorphism α of Q leaving R fixed, set y = αx if 5.3 holds. Then
α is a homeomorphism of X. If A is imbedded in the natural way in C(X, Q),
then A is precisely the class of functions, vanishing at infinity, such that

5.41. $f(\alpha x) = \alpha(f(x))$

for all α and all x. Moreover,

5.42. $\|f\| = \sup_x |f(x)|$
and
5.43. $f^*(x) = f(x)^*$.


Proof. We establish the facts in the reverse of the order mentioned. First, 5.43
and 5.42 are consequences of 5.11, 5.12 and 5.2 collectively. From 5.42 and
the assumed completeness of A we infer that A is a closed real linear subalgebra
of C(X, Q) with the k-topology. Each f of A satisfies 5.41 because of the
definition of αx = y by 5.3; and each f of A vanishes at infinity. Let B be
the class of functions for which 5.41 holds, for all α and x, and which vanish
at infinity.

We shall show that A is bi-approximate to B. Let x and y be points of
X, and let $f \in B$. Let f(x) = a, f(y) = b. The image algebra S in Q of A
under x determines, and is determined by the subgroup $\Gamma_x$ of those α for which
αx = x. Since αa = a for all those α, we infer that $a \in S$.
Hence there is a $g_x \in A$ such that $g_x(x) = a = f(x)$. If, furthermore, there
is an α such that y = αx, then also b = αa and $g_x(y) = g_x(\alpha x) = \alpha(g_x(x))
= \alpha a = b$; and in this case $h = g_x$.

If there is no α such that y = αx, obtain first of all a $g_y \in A$ such that
$g_y(y) = b$ by the same argument used to find $g_x$. But if y ≠ αx for any α,
the kernel ideals of x and y must be distinct. Being maximal, each must
contain an element not in the other; and therefore there exist $h_x$ and $h_y$ such
that $h_x(x) \ne 0$, $h_x(y) = 0$, $h_y(y) \ne 0$, $h_y(x) = 0$. By considering $h_x$ and
$h_y$ replaced by suitable scalar multiples of $h_xh_x^*$ and $h_yh_y^*$ respectively, we see
that one could achieve $h_x(x) = 1$ and $h_y(y) = 1$. In case this paragraph
applies, let $h = h_xg_x + h_yg_y$. Then h(x) = a = f(x) and h(y) = b = f(y).
Such an h was obtained also in the preceding paragraph. Thus A is bi-approximate
to B, and by 4.3, A coincides with B as asserted in 5.4. The
remaining assertions of 5.4 were already considered in the introductory paragraph
preceding 5.4.

The proof of 5.4 should be compared with that of [Arens and Kaplansky,
loc. cit.]. There, the ring A was imbedded in a larger ring A' admitting as
scalars a division algebra (K) which was known in advance to contain all
possible residue fields; and then previously established results were applied
to obtain a representation of A' as functions on a space X. From the way A
was imbedded in A' it was then discovered that A consisted of all functions
satisfying a certain condition of covariance (viz., f(x*) = f(x)*) analogous
to 5.41. The same attack was also made on certain algebraic algebras, in
[Arens and Kaplansky]. And finally, it had to be established a posteriori that
X was the space Hom(A, K) (in 2.7, above). But here (in the proof of
5.4), we could not "extend" the coefficient field of A from R to Q. This is
not only because the new “ scalars ” would not commute with the ring elements 
(pre-factors, post-factors, etc.) but a fundamental and unaesthetic asymmetry 
arises even when one considers the simple case A = KX itself. Formal methods 
(Cayley-Dickson process [Dickson, p. 15]) lead to non-associative algebras,
and therefore make a prediction of residue algebras on the basis of [Arens II]
impossible. Therefore we began with the space Hom(A, Q), or in other words,
with the space of irreducible (although not inequivalent) representations as
linear operators on 2-dimensional unitary space. Hence the analogue of 2.7,
desirable for completeness’ sake, was involved from the start. Spaces of ideals, 
and debates about their topologies, are avoided. On the other hand, the proof 
that we obtained all the continuous functions satisfying a condition of co- 
variance (5.41) became enormously complicated, requiring almost all of the 
machinery of sections 3 and 4. 

A smaller distinction between 5.4 and [Arens and Kaplansky, Theorem 
9.1]—that (*) forms a homeomorphism of X in the latter but not in the 
former—should also be observed. 


A characterization of BQ*-algebras can be formulated in the following way. 
5.5. THEOREM. Let X be a compact Hausdorff space on which the
orthogonal group Γ acts as a transformation group [leaving a point $x_0$ fixed].
Consider the subalgebra A of C(X, Q) of those functions [vanishing at $x_0$]
which satisfy

5.51. $f(\alpha x) = J_\alpha(f(x))$ (for all $x \in X$)

where $J_\alpha$ is the inner automorphism of Q associated (in the way discovered
by Hamilton) with the rotation α. Set $\|f\| = \sup |f(x)|$, and $f^*(x) =
f(x)^*$ for each $f \in A$. Then A is a BQ*-algebra (possessing a unit if and
only if $x_0$ is isolated). Then X = Hom(A, Q); and all BQ*-algebras are
obtainable in this way.
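
Hamilton's association between rotations and inner automorphisms of Q, on which 5.51 rests, can be illustrated as follows. This is a minimal sketch under an assumed convention: the rotation through the angle θ about a unit axis n is taken to correspond to q → u q u* with u = cos(θ/2) + sin(θ/2)·n; the function names are illustrative and not part of the text.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions given as (t, x, y, z) = t + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qconj(q):
    """Quaternion conjugate."""
    t, x, y, z = q
    return (t, -x, -y, -z)


def unit_for_rotation(axis, theta):
    """Unit quaternion cos(theta/2) + sin(theta/2) * axis (axis assumed to be a unit vector)."""
    half = theta / 2.0
    s = math.sin(half)
    return (math.cos(half), s * axis[0], s * axis[1], s * axis[2])


def inner_automorphism(u, q):
    """q -> u q u*, an automorphism of Q leaving the reals fixed when |u| = 1."""
    return qmul(qmul(u, q), qconj(u))


# Quarter turn about the k-axis: the pure quaternion i is carried to j.
u = unit_for_rotation((0.0, 0.0, 1.0), math.pi / 2)
image_of_i = inner_automorphism(u, (0.0, 1.0, 0.0, 0.0))
# Real quaternions are left fixed.
image_of_real = inner_automorphism(u, (3.0, 0.0, 0.0, 0.0))
```

Since u and −u give the same automorphism, each rotation corresponds to a pair of opposite unit quaternions.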


Proof. The only thing that is not either obvious or covered by 5.4 is,
that X = Hom(A, Q). But we leave it to the reader to construct the proof
of this, along the lines of the proof of 2.7. The necessary argument of
bi-approximity uses the technique of the demonstration of bi-approximity given
in 5.4.

Concerning the topology of X = Hom(A, Q), we can make the following
statement.

5.6. THEOREM. Let A be a BQ*-algebra, and let X = Hom(A, Q) be
given the topology described above. Then X is metrizable if and only if A is
separable. If X is metrizable, the metric can be so chosen that it is invariant
under the homeomorphisms induced by Γ.


The proof of [Gelfand, Satz 12] can be adapted to establish the present
theorem. The formula for the metric given there, after suitable change of
notation and interpretation for quaternion-valued functions, yields a metric
invariant under Γ.

We now present a characterization of the space C(X, Q) itself, supposing
X to be compact. This theorem is of interest in itself, but it is not likely that 
the hypothesis could be fulfilled in a ring actually arising in a situation not 
involving the quaternions intimately a priori. 


5.8. THEOREM. A necessary and sufficient condition that a real Banach
algebra A is isomorphic to C(Y, Q) with

$$\|f\| = \sup_{y \in Y} |f(y)|$$

for some compact Hausdorff space Y is that A be a BQ*-algebra containing
a subalgebra isomorphic (with preservation of the *-operation) to the
quaternions.


Proof. The necessity is obvious (see 5.5). We turn to the sufficiency.
Let X = Hom(A, Q), and let us regard Q itself as a subset of A, rather than
merely isomorphic to it (or, what is almost the same thing, select a definite
one of the isomorphisms which our hypothesis provides). Let Y be the subset
of X containing those y for which $q \in Q \subset A$ implies y(q) = q. Y is evidently
compact, and A is homomorphic to a subspace of C(Y, Q). Let $f \in A$
and suppose $\|f\| = r$. Then there is some $x \in X$ for which $x(ff^*) = r^2$. By
means of a suitable $\alpha \in \Gamma$ we have $y = \alpha x \in Y$. But $y(ff^*) = r^2$ as well.
Therefore

$$\sup_{y \in Y} |f(y)| \ge \|f\|.$$

The reverse inequality is obvious. Therefore A is algebraically and topologically
isomorphic to a subset A' of C(Y, Q) with the k-topology. A' clearly
satisfies the conditions of 4.4, because distinct elements of Y have distinct
kernel ideals in A. Therefore A' = C(Y, Q), and our theorem is proved.

Suppose we have a concrete BQ*-algebra A of quaternion-valued functions,
and suppose that a basis in Q can be so chosen that for each function f in A,
there can be found in A also the four components $f_0, f_1, f_2, f_3$ of f, where
$f = f_0 + f_1 + f_2 + f_3$, $f_0$ being real-valued, $f_1, f_2, f_3$ being some real-valued
functions times i, j, and k, respectively. It follows that A is closed under the
automorphisms $f \to f^i$, $f \to f^j$, $f \to f^k$ where $f^i(x) = -if(x)i$, etc. These
automorphisms have the following abstract properties (among others):

5.81. ii = jj = identity, ij = ji = k;
5.82. $f + f^i + f^j + f^k = 2(f + f^*)$.
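
For the concrete automorphisms $q \to -iqi$, $q \to -jqj$, $q \to -kqk$ of Q the two relations can be verified directly. The sketch below assumes that 5.82 reads $f + f^i + f^j + f^k = 2(f + f^*)$, which is the form the subsequent proof of 5.9 relies on; the names `twist` and `relation_582_holds` are illustrative.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (t, x, y, z) = t + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qconj(q):
    """Quaternion conjugate."""
    t, x, y, z = q
    return (t, -x, -y, -z)


def qadd(p, q):
    """Coordinatewise sum."""
    return tuple(a + b for a, b in zip(p, q))


I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def twist(u, q):
    """The automorphism q -> -u q u for u one of the units i, j, k."""
    minus_u = tuple(-c for c in u)
    return qmul(qmul(minus_u, q), u)


def relation_582_holds(q):
    """Check q + q^i + q^j + q^k == 2 (q + q*) for a single quaternion q."""
    lhs = qadd(qadd(q, twist(I, q)), qadd(twist(J, q), twist(K, q)))
    rhs = tuple(2 * c for c in qadd(q, qconj(q)))
    return lhs == rhs


q = (1, 2, 3, 4)
# 5.81: each automorphism is an involution, and i, j compose (in either order) to k.
involution_ok = twist(I, twist(I, q)) == q and twist(J, twist(J, q)) == q
composition_ok = twist(I, twist(J, q)) == twist(K, q) == twist(J, twist(I, q))
```

Both sides of 5.82 reduce to four times the real part of q, which is why a self-adjoint element is left fixed by all three automorphisms.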
Conversely, we have the following representation theorem. 


5.9. THEOREM. Let A be a BQ*-algebra in which there are defined
three automorphisms i, j, and k satisfying 5.81 and 5.82. Then there exists
a compact Hausdorff space Y on which acts a group G' = {i', j', k', and the
identity} of homeomorphisms satisfying the relations 5.81, leaving a point
$y_0$ of Y fixed. A is isomorphic to the subalgebra B of C(Y, Q) determined
by the relations

5.91. $f(y)^i = f(i'y)$, $f(y)^j = f(j'y)$, $f(y)^k = f(k'y)$, $f(y_0) = 0$.


(N.B.: The index j, etc. on a quaternion such as q or f(y) denotes the image
of that quaternion under the concrete inner automorphism $q \to -jqj$; but attached to
an element f of A it denotes the image under the automorphism abstractly presupposed
for A in 5.9 and satisfying merely the axioms stated. It should be borne in mind that
in $f^j(y)$ the index has been attached to f and not to f(y).)


Moreover (using the same notation for B as for A)

5.92. $\|f\| = \sup_{y \in Y} |f(y)|$
and
5.93. $f^*(y) = f(y)^*$, $f^i(y) = f(y)^i$, $f^j(y) = f(y)^j$, $f^k(y) = f(y)^k$.

Finally, $y_0$ is isolated if and only if A has a unit.


Proof. We begin by drawing certain consequences from 5.81 and 5.82.
First, if f = f* then $f = f^i = f^j = f^k$. This is obvious; when one replaces f*
in 5.82 by f, one obtains that 4f is invariant under all three automorphisms,
by 5.81. Second, each automorphism commutes with the *-operation: to see
this, proceed as follows. Replace f in 5.82 by $g - g^i$. Then the left side
vanishes, yielding $g - g^i + g^* - g^{i*} = 0$. Now apply i to g + g*, obtaining
$g + g^* = g^i + g^{*i}$. Comparison of these equations gives $g^{i*} = g^{*i}$.

Define Y as the subset of Hom(A, Q) at which we have $f^i(y) = f(y)^i, \cdots,
f^k(y) = f(y)^k$. G' is the subgroup of the group of homeomorphisms induced
by Γ which is defined by 5.91. Thus 5.91 and 5.93 hold, where $y_0$ is the
zero-homomorphism of A. We have now only to prove that 5.92 holds, which
implies incidentally that Y does not consist of $y_0$ alone (unless A is void).

Let f be given, and suppose $\|f\| = r$. There is an x in Hom(A, Q)
such that |f(x)| = r. For any g, if g(x) = 0 then gg*(x) = 0. Hence
$(gg^*)^i(x) = g^i(x)g^i(x)^* = 0$. Therefore $g^i(x) = 0$ if g(x)
= 0; and similar results obtain for j and k. Hence these automorphisms
induce inner automorphisms of Q at the various points of Hom(A, Q)
(Cf. [Jacobson, p. 101, Theorem 15]). If at the point x, all three
automorphisms are extensible to Q as the identity, then 5.82 implies
that g(x) = g(x)* for all g and hence $x \in Y$. If not, suppose that $g^i(x)
\ne g(x)$ for some g. Then we may suppose that for every g in A, $g^i(x)
= p^*g(x)p$ where p is a pure quaternion, for otherwise $i^2$ would not be the
identity. Let p = qiq* for some pure q. Let $\gamma \in \Gamma$ be the inner automorphism
$q^* \cdots q$ of Q, and let z = γx in Hom(A, Q). It is easily verified that
$g^i(z) = g(z)^i$. At the point z consider the inner automorphism of Q induced
by j. As before, suppose $g^j(z) = u^*g(z)u$ with some pure u. Since i does
not leave every value at z unchanged, we may conclude that i itself is obtained
as one of the values at z. Suppose h(z) = i. Inserting h into 5.82 and
evaluating at z, we obtain u*iu = −i. Therefore, a quaternion v may be
found such that v*iv = i and u = vjv*. Let β be the inner automorphism
$v^* \cdots v$ and let y = βz in Hom(A, Q). It is easily verified that $g^i(y)
= g(y)^i$ and $g^j(y) = g(y)^j$. Consequently y lies in Y, and 5.92 is established.

The bi-approximity of A to B is established along the now familiar lines 
followed in 5.4. Thus 5.9 can be established. 

This theorem can also be proved by constructing an algebra $A_1$ of ordered
quadruples which (a) contains A and (b) admits the quaternions as scalars.
The automorphisms i, j, k are used to define the commutation relation between
the elements of A and those of Q. This mode of proof is very tedious because
the axioms of BQ*-algebras for $A_1$ can only be reduced to those for A by long
calculations.

At this point it seems appropriate to consider whether every BQ*-algebra 
can not be represented simply as a space C(Y,Q). To put it another way, 
is the covariance feature 5.51 of 5.5 sometimes indispensable, or are the
conditions of 5.8 always fulfilled? The fact is, that not every BQ*-algebra is
representable as a space C(Y, Q).

We shall describe a general method for obtaining BQ*-algebras which are
not representable as spaces C(Y, Q).

Our method is an elaboration of the following principle. 


5.94. Lemma. If A is a subalgebra of C(X, Q) and A is isomorphic to
C(Y, Q) for some suitable compact Hausdorff space Y, then there exists a
continuous mapping of the space X into the orthogonal group Γ.


Proof. Find f, g, and h in A which correspond to the constant functions
i, j, and k in the representation of A as C(Y, Q). Now at each x of X, the
triple f(x), g(x), h(x) can, by a unique orthogonal transformation α(x)
of the space of pure quaternions, be rotated into the fixed triple of quaternions
i, j, k:

5.941. $(f(x), g(x), h(x)) = \alpha(x)(i, j, k)$.

This α(x) depends continuously on x and thus provides the desired mapping
of X into Γ.
The general method of obtaining examples is now to be set forth. 


5.95. Lemma. Let X be a completely regular space containing a closed
set Z which can be mapped into Γ by a continuous mapping β which cannot
be continuously extended to X. Let A be the subalgebra of those f in C(X, Q)
defined by the requirement

5.951. $\beta(z_0)(f(z)) = \beta(z)(f(z_0))$ ($z \in Z$)

where $z_0$ is some distinguished point of Z arbitrarily selected in advance. Then
A admits neither K nor Q as scalars and in particular, is not representable
as a space C(Y, Q).


Proof. If A admitted K as scalars, it would be commutative (see Theorem
5.97 below), but A is surely not commutative because the complement of Z
on which the functions in A are subject to no restriction must under the
hypothesis of inextensibility of β be a non-void open set. This shows incidentally
that A isn't void. If A admitted Q as scalars, then as in 5.94 we
could find f, g, h in A which would have the abstract algebraic properties of
the quaternions i, j, k. We should then set up a mapping α as we did in 5.941.
For z in Z we would then have

5.952. $\alpha(z)(i, j, k) = (f(z), g(z), h(z))
= \beta(z_0)^{-1}\beta(z)(f(z_0), g(z_0), h(z_0))$

and consequently

5.953. $\beta(z) = \beta(z_0)\,\alpha(z)\,\alpha(z_0)^{-1}$.

Therefore $\beta(z_0)\,\alpha(x)\,\alpha(z_0)^{-1}$ would provide the extension of β from Z to all
of X. This proves 5.95.


5.96. THEOREM. There exists a BQ*-algebra which cannot be represented
by a space C(Y, Q), but modulo each maximal ideal the residue
algebra is Q.


Proof. Using the terminology of 5.95, let X be the closed disc ($0 \le r \le 1$,
$0 \le \theta < 2\pi$, in polar coordinates), and let Z be the periphery (r = 1). Now
there exist mappings β of Z into Γ which cannot be extended to all of X.
This is because the fundamental group of Γ is not trivial, and because the
extension of β to all of X would show that the path given by β(z) for $z \in Z$
was homotopic to the null-path. As an example, we propose $\beta(1, \theta) \in \Gamma$ to
be that rotation of the 2-sphere which corresponds to the inner automorphism 
) of the quaternions. This gives a path in which punctures the 
"plane at infinity" an odd number of times, and at any rate a contemplation
of a suitable polytope will convince the reader that this β is not homotopic
to 0. Hence condition 5.951 may be taken as the criterion that f belongs to A,


and read 


5. 561. f(1, 0) = (1, 0) (0S0< 


Then 5.95 assures us that A is not representable as a space C(Y,Q). 

Each point (r, θ) with r < 1 corresponds to one and only one maximal
ideal, and all points (r, θ) with r = 1 determine one additional ideal. It is
easy to see that this accounts for all maximal ideals since a function vanishing
at none of these points has an inverse element in A. The values f(x) at any
such x, as f ranges over all elements of A, evidently exhaust Q, whence the
residue algebras modulo maximal ideals are all isomorphic to Q.

The reason for stressing the residue algebras modulo maximal ideals for 
this example is of course that if for any maximal ideal the residue algebra is 
not Q, then the possibility that A is representable as a space C(Y, Q) is at
once ruled out. 

We have been unable to determine whether this example can be repre- 
sented by 5. 9, but we doubt it. 

We end this section with a proposition which shows that it would have
been undesirable to limit one's self to BQ*-algebras admitting complex scalars.

5.97. THEOREM. A BQ*-algebra in which complex scalar multiplication
is possible, is commutative.


Proof. Select any $x \in X$ = Hom(A, Q) and an $h \in A$ such that x(h) = 1
and h = h*. Since x(ih · ih) = −1, x(ih) = q is not a real
number in Q. Now q x(f) = x(ihf) = x(ifh) = x(fih) = x(f) q for any f in
A. Thus x(f) lies in the commutative subalgebra generated by q. Hence
x(fg − gf) = 0 for any f and g. Thus A is commutative. The representation
of such a ring can therefore be effected by [Arens III, p. 279, Cor. 1].


6. Closed ideals in BQ*-algebras. The representation theorems make
possible a number of statements about the closed ideals in a BQ*-algebra.

6.1. THEOREM. In a BQ*-algebra A, every closed ideal is a two-sided
ideal, and is an intersection of maximal ideals. The closed ideals of A are
in one-to-one correspondence with the family of those compact subsets of
X = Hom(A, Q) which are invariant under the homeomorphisms of X induced
by the automorphisms of Q leaving R fixed (see 5.3). Each closed ideal is
closed under the *-operation.


Proof. We shall prove these statements in the reverse of the order stated, 
using 5. 4. 

Let J be a closed ideal. We must show that $f^* \in J$ if $f \in J$. However,
if $f \in J$ then $ff^* \in J$, and so by 5.09, J contains f*.

Let $Z \subset X$ be invariant under Γ. Let $J_Z$ be the class of those f in A
which vanish on Z. $J_Z$ is evidently a closed two-sided ideal.

Conversely, let J be a closed ideal. For definiteness, suppose J is a right
ideal. Let $Z = Z_J$ be the class of those x in X at which every element of J
vanishes. Z may be void, but in any case it is closed and invariant under Γ.
We shall show that $J = J_Z$, in the notation of the preceding paragraph, by
showing that J is bi-approximate to $J_Z$, it being clear that J is included in $J_Z$.

Let x, y in X, and $h \in J_Z$, be given.

First, suppose x and y lie without Z. Then there exist f and g in J such
that $f(x) \ne 0 \ne g(y)$. Now ff*(= f*f) and gg*(= g*g) belong to J also,
and their values at x and y, respectively, are real and positive. By scalar
multiplication, we can now obtain $f_1$ and $g_1$ such that $f_1(x) = 1$, $g_1(y) = 1$.
Now either y = αx for some α, or not. Consider the latter case. Then we
can find, as in the proof of 5.4, $h_x$ and $h_y$ in A such that $h_x(x) = h_y(y) = 1$
and $h_x(y) = h_y(x) = 0$. The element $f_1h_xh + g_1h_yh$ belongs to J. It has the
same values at x and y, respectively, as h has. In the former case, let $h' = f_1h$.
This element lies in J; and

$$h'(x) = f_1(x)h(x) = h(x),$$
$$h'(y) = h'(\alpha x) = \alpha(h'(x)) = \alpha(h(x)) = h(\alpha x) = h(y).$$

Second, suppose $y \in Z$. We can dismiss the case that also $x \in Z$. We obtain $f_1$
as before, and form $h' = f_1h$ in J. Then h'(x) = h(x), and h'(y) = 0 = h(y).

We conclude that $J = J_Z$, by 4.3.

The correspondence referred to in 6.1 is that between $J_Z$ and $Z_J$. It will
be observed that it is inclusion-reversing, as is to be expected in these matters.
Therefore, the maximal ideals correspond to the smallest invariant sets, i.e.,
the orbits of the individual points of X under Γ. Since each $Z_J$ is a union of
some of the latter, each $J_Z$ is an intersection of the former.

It is already abundantly clear that each right ideal is also a left ideal,
but this can also be demonstrated as follows. Let J be a right ideal. Let
$f \in J$ and $g \in A$. Then (gf)* = f*g* lies in J because f* does. Hence $gf \in J$.
This completes the proof of 6.1.


6.2. COROLLARY. If X is a compact Hausdorff space, then the closed
ideals of C(X, Q), with the k-topology, are in one-to-one inclusion-reversing
correspondence with the compact subsets of X.

This theorem may be extended to completely regular spaces X. The sets
that correspond to closed ideals are, however, still compact. We leave the
proofs to the reader. And then the results may, of course, be specialized to
the case of K or R, rather than Q.

In finite-dimensional BQ*-algebras, all ideals are linear subspaces and
hence are closed. Therefore, in the finite-dimensional case, all ideals are
two-sided. On the other hand, one of the simplest infinite-dimensional
BQ*-algebras does have ideals which are not two-sided and hence also, a fortiori,
ideals which are not closed.


6.3. EXAMPLE. If X is the unit interval, then the BQ*-algebra C(X, Q)
contains a right ideal which is not two-sided (Cf. [Kaplansky, p. 180]).

The ideal we wish to exhibit is the principal right ideal J generated by
the function

$$g(x) = x(\cos 1/x + i \sin 1/x), \quad 0 < x \le 1; \qquad g(0) = 0.$$

To show that J is not two-sided, we need only attempt to find an f in C(X, Q)
such that jg = gf. If this were possible, a simple calculation shows that, for
$0 < x \le 1$ we must have

$$f(x) = j \cos 2/x - k \sin 2/x.$$


Hence no definition of f(0) can make f belong to C(X,Q). Therefore 
jg does not belong to J, and J is merely a right ideal. 
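
The "simple calculation" can be reproduced numerically. The sketch below assumes the generator is $g(x) = x(\cos 1/x + i \sin 1/x)$ and computes the only possible value $f(x) = g(x)^{-1} j\, g(x)$ for x > 0; the function names are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions given as (t, x, y, z) = t + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qinv(q):
    """Inverse of a non-zero quaternion: conjugate divided by squared modulus."""
    t, x, y, z = q
    n = t * t + x * x + y * y + z * z
    return (t / n, -x / n, -y / n, -z / n)


J_UNIT = (0.0, 0.0, 1.0, 0.0)


def g(x):
    """Generator of the right ideal, for 0 < x <= 1."""
    return (x * math.cos(1.0 / x), x * math.sin(1.0 / x), 0.0, 0.0)


def forced_f(x):
    """The value f(x) forced by j g = g f at a point x > 0."""
    gx = g(x)
    return qmul(qmul(qinv(gx), J_UNIT), gx)


def closed_form(x):
    """The closed form j cos(2/x) - k sin(2/x)."""
    return (0.0, 0.0, math.cos(2.0 / x), -math.sin(2.0 / x))


def close(p, q, tol=1e-6):
    """Coordinatewise comparison within an absolute tolerance."""
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(p, q))


# Along x = 1/(n pi) the forced value is j, along x = 2/((2n+1) pi) it is -j,
# so f has no limit as x tends to 0.
n = 1000
near_plus_j = forced_f(1.0 / (n * math.pi))
near_minus_j = forced_f(2.0 / ((2 * n + 1) * math.pi))
```

The two sequences of sample points both tend to 0 while the forced values stay at j and −j respectively, which is the discontinuity the example exploits.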


7. A remark on Banach *-algebras. This section deals with a detail on 
Banach *-algebras, i.e., Banach algebras A with complex scalars, in which 
there is defined a semilinear operation (*) satisfying 5.01 and also 


7.01. (cf)* = c*f* (c complex).


This section is otherwise unrelated to the preceding theory. Our purpose is 
to prove the following theorem. 


7.1. THEOREM. Let A be a Banach *-algebra with a unit, in which
5.04 holds and in which, for all f, 


7.2. i =i 
There can be no element f in A such that ff* +1=—0. 


Before proceeding to the proof of 7.1, we indicate the origin of this 
question. In [Gelfand and Neumark, p. 196 (footnote)] it is conjectured
that from the hypothesis of 7.1 above it follows that ff* + 1 have a unique 
two-sided inverse for every f (in the commutative case, this is known to be 
true). From this conjecture, 7.1 would follow. Hence the conjecture is at 
least slightly substantiated by 7. 1. 




Proof of 7.1. We shall show first, that if ff* = −1 then the smallest
circle in the complex plane with center at the origin and surrounding the
spectrum of f contains the unit circle. For the radius of this circle is given by

7.3. $r = \lim_{n \to \infty} \|f^n\|^{1/n}$

(Cf. [Arens III, first part of proof of Lemma 2] or, of course, [Gelfand]).
Now the radius of a similar circle for f* is clearly the same as r, for algebraic 
reasons. By 7. 2, 


fn 


hence 


By letting n approach infinity and observing 7.3, we obtain $r \ge 1$. As a
matter of fact, $r \le \|f\|$, $\|f^*\| = 1$ because the spectrum of any element is
confined to a disc about the origin whose radius is the norm. Consequently
r = 1.

We proceed to another calculation. Let c be any complex number of
absolute value 1. Then (f − c)(f* − c*) = ff* − cf* − c*f + 1. Since
ff* + 1 = 0 we obtain


Therefore the spectrum of f is within a distance of $2^{1/2}$ of every point on the
unit circle. This contradicts the minimum property already established for 
the unit circle. Thus 7.1 is proved. 


UNIVERSITY OF CALIFORNIA AT LOS ANGELES. 


BIBLIOGRAPHY. 


L. Alaoglu, “ Weak topologies of normed linear spaces,” Annals of Mathematics, vol. 41 
(1940), pp. 252-267. 

R. Arens, (I), "Topologies for spaces of transformations," Annals of Mathematics,
vol. 47 (1946), pp. 480-495.

R. Arens, (II), "Topological division algebras," Bulletin of the American Mathematical
Society, vol. 53 (1947), pp. 623-630.

R. Arens, (III), "Representation of *-algebras," Duke Mathematical Journal, vol. 14
(1947), pp. 269-282.

R. Arens and I. Kaplansky, “ Topological representation of algebras,” Transactions of 
the American Mathematical Society, vol. 63 (1948), pp. 457-481. 




G. Birkhoff, "Lattice ordered groups," Annals of Mathematics, vol. 43 (1942), pp. 298-331.
J. A. Clarkson, “ A characterization of C-spaces,’ Annals of Mathematics, vol. 48 
(1947), pp. 845-50. 
L. E. Dickson, Linear Algebras, Cambridge Mathematical Tracts 16, Cambridge (1914). 
N. Dunford and I. E. Segal, “‘ Semigroups of operators and the Weierstrass theorem,” 
Bulletin of the American Mathematical Society, vol. 52 (1946), pp. 911-914. 
I. Gelfand, "Normierte Ringe," Recueil Mathématique, N. S., vol. 9 (51), (1941), pp.
3-24.
I. Gelfand and M. Neumark, “On the imbedding of normed rings into the ring of 
operators in Hilbert space,” Recueil Mathématique, N. S., vol. 12 (54), 
(1943), pp. 197-213. 
N. Jacobson, Theory of rings, Mathematical Surveys 2, American Mathematical Society, 
New York (1943). 
S. Kakutani, “ Concrete representation of abstract (IM)-spaces,” Annals of Mathematics, 
vol. 42 (1941), pp. 994-1024. 
I. Kaplansky, “ Topological rings,’ American Journal of Mathematics, vol. 69 (1947), 
pp. 153-183. 
H. Lebesgue, “ Sur l’approximation des fonctions,” Bulletin des Sciences Mathématiques, 
vol. 22 (1898), pp. 278-287. 
M. H. Stone, "Applications of the theory of Boolean rings to general topology,"
Transactions of the American Mathematical Society, vol. 41 (1937), pp. 375-481.


MULTIPLICATIVE DIOPHANTINE EQUATIONS IN 
QUATERNIONS.* 


By E. ROSENTHALL.


1. This note is concerned with the resolution of multiplicative equations
in integral quaternions. We consider only Lipschitz integral quaternions
$T = t_1 + it_2 + jt_3 + kt_4$ with rational integral coordinates $t_i$, where i, j, k
are the familiar Hamiltonian units. Conjugate and norm of T are defined
as usual: $\bar{T} = 2t_1 - T$, $N(T) = T\bar{T} = \sum t_i^2$; R[T] will represent the
rational part $t_1$ of T. Except for the quaternion units i, j, k, all small Latin
letters with or without subscripts will denote rational integers; large capital
letters A, B, ··· will represent integral quaternions; their corresponding
coordinates are designated by $a_1, a_2, a_3, a_4$, and so on. We say that A is
even or odd according as N(A) is even or odd, and hence according as
$a_1 + a_2 + a_3 + a_4$ is even or odd. If $(a_1, a_2, a_3, a_4) = 1$, A is said to be
primitive. The Greek letter + will represent any one of the five even 
quaternions 1+ 7, 1+ /,1+h%, 

The main object of this paper is to obtain the complete integral parametric
representation in integral formulas of the equation XY = ZW. The
resolution of this equation will be based on Theorems 1 and 2. Theorem 1 
is interesting in itself and generalizes the problem of finding all integral 
quaternions which are doubly divisible by a given integral quaternion [4]. 
Theorem 2 is not new [2, p. 7], but the proof given here shows that the result 
is an immediate consequence of the factorization of primitive quaternions. 


2. THEOREM 1. The complete integral solution of the system 
(1) $TW = ZU$, $N(T) = N(U)$, $T$ odd and primitive, 
is given by 
(2) $W = \bar T X + \bar Y U$, $Z = X\bar U + T\bar Y$, 
where $X$ and $Y$ are arbitrary. 
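That every pair given by (2) satisfies (1) can be confirmed numerically. The sketch below is an illustration under stated assumptions: the conjugation bars in (2) are taken as $W = \bar T X + \bar Y U$, $Z = X\bar U + T\bar Y$, quaternions are stored as integer 4-tuples, and all function names are hypothetical. It checks only that the formulas give solutions; the completeness claim of the theorem is not tested.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)


def qconj(a):
    """Conjugate: negate the i, j, k coordinates."""
    return (a[0], -a[1], -a[2], -a[3])


def qadd(a, b):
    return tuple(x + y for x, y in zip(a, b))


def norm(a):
    return sum(x * x for x in a)


def theorem1_pair(T, U, X, Y):
    """Return (W, Z) built from arbitrary X, Y as in (2) (assumed form)."""
    W = qadd(qmul(qconj(T), X), qmul(qconj(Y), U))
    Z = qadd(qmul(X, qconj(U)), qmul(T, qconj(Y)))
    return W, Z


# Example data: N(T) = N(U) = 5, T odd and primitive; X, Y arbitrary.
T, U = (1, 2, 0, 0), (2, 0, 1, 0)
X, Y = (3, -1, 4, 2), (0, 5, -2, 7)
W, Z = theorem1_pair(T, U, X, Y)
```

The identity $TW = ZU$ holds here because $T\bar T = N(T) = N(U) = \bar U U$ is a rational integer and so commutes with $X$.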


Substituting in (1) the value of $W$ from (2) we get $ZU = T\bar T X + T\bar Y U$ 
$= (X\bar U + T\bar Y)U$, $Z = X\bar U + T\bar Y$. Hence it suffices to show that the linear 
system, obtained by equating coordinates in (2), 


* Received June 12, 1948. 




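(3) 
$$\begin{aligned}
t_1x_1 + t_2x_2 + t_3x_3 + t_4x_4 + u_1y_1 + u_2y_2 + u_3y_3 + u_4y_4 &= w_1,\\
-t_2x_1 + t_1x_2 + t_4x_3 - t_3x_4 + u_2y_1 - u_1y_2 - u_4y_3 + u_3y_4 &= w_2,\\
-t_3x_1 - t_4x_2 + t_1x_3 + t_2x_4 + u_3y_1 + u_4y_2 - u_1y_3 - u_2y_4 &= w_3,\\
-t_4x_1 + t_3x_2 - t_2x_3 + t_1x_4 + u_4y_1 - u_3y_2 + u_2y_3 - u_1y_4 &= w_4,
\end{aligned}$$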


is always solvable for integers $x_i$, $y_i$ when $T$, $W$, $U$ satisfy (1). System (3) 
will be solvable in integers if the g.c.d. of all the four-rowed determinants 
in the matrix of the coefficients is equal to the g.c.d. $k$ of all the four-rowed 
determinants in the augmented matrix (M) [6, p. 10]. Let $(a, b, c, d)$ 
denote the determinant formed from the columns in position $a$, $b$, $c$, $d$ in the 
augmented matrix, and let $m = N(T)$. Then $(1, 2, 3, 4) = m^2$, and so $k$ can 
only contain factors of $m$. Also $(1, 3, 6, 8) = m[m - (u_1^2 + u_3^2 + t_1^2 + t_3^2)]$ 
$= ma$ and $(1, 3, 5, 7) = m[m - (t_1^2 + t_3^2 + u_2^2 + u_4^2)] = mb$. Let $p \mid m$, 
where $p$ is a rational prime, and let $p \mid a$ and $p \mid b$; then $p \mid a + b$, $p \mid t_1^2 + t_3^2$. 
Similarly if $p \mid c$ and $p \mid d$ in $(1, 2, 5, 6) = m[m - (t_1^2 + t_2^2 + u_3^2 + u_4^2)]$ 
$= mc$, and in $(3, 4, 5, 6) = m[m - (t_3^2 + t_4^2 + u_3^2 + u_4^2)] = md$, then 
$p \mid t_1^2 + t_2^2$; and if $p \mid e$ and $p \mid f$ in $(2, 3, 5, 8) = m[m - (u_1^2 + u_4^2 + t_1^2 
+ t_4^2)] = me$ and in $(1, 4, 5, 8) = m[m - (u_2^2 + u_3^2 + t_1^2 + t_4^2)] = mf$, 
then $p \mid t_1^2 + t_4^2$. Hence $p \mid 2t_1^2 + N(T)$, $p \mid t_i$ $(i = 1, 2, 3, 4)$, $p = \pm 1$, 
since $T$ is primitive, and therefore $k \mid m$. We have only to show now that 
all the four-rowed determinants are divisible by $m$. 

Note that the elements of the successive columns in (M), taken in order 
of the rows, are respectively the components of the quaternions $\bar T$, $\bar T i$, $\bar T j$, $\bar T k$, 
$U$, $-iU$, $-jU$, $-kU$, $W$. We observe that the determinant 


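(4) $$d = \begin{vmatrix} t_1 & t_2 & a_1 & b_1 \\ -t_2 & t_1 & a_2 & b_2 \\ -t_3 & -t_4 & a_3 & b_3 \\ -t_4 & t_3 & a_4 & b_4 \end{vmatrix}$$

is equal to $m(b_1a_2 - a_1b_2) + (t_1^2 + t_2^2)(B\bar A)_i + (t_2t_3 - t_1t_4)(B\bar A)_j + (t_2t_4 
+ t_1t_3)(B\bar A)_k$, where $(B\bar A)_i$ is the coefficient of $i$ in the product $B\bar A$. Since 
$\bar T iT = -im + 2i(t_1^2 + t_2^2) + 2j(t_2t_3 - t_1t_4) + 2k(t_1t_3 + t_2t_4)$ it follows 
that 

(5) $2d \equiv -R[\bar T iT \cdot B\bar A] \pmod m$ or $2d \equiv R[\bar T iT \cdot A\bar B] \pmod m$. 

If $B = W$ and $A = \theta U$ or $\bar T\theta$, where $\theta$ is a unit, then $2d \equiv -R[\bar T iTW\bar U\bar\theta]$ 
$= -R[\bar T iZU\bar U\bar\theta] \equiv 0 \pmod m$ or $2d \equiv R[\bar T iT\bar T\theta\bar W] \equiv 0 \pmod m$. Hence all 
the four-rowed determinants with columns 1, 2 and 9 are divisible by $m$. 

Again if $B = \theta_1U$, $A = \theta_2U$, then $2d \equiv -R[\bar T iT\theta_1U\bar U\bar\theta_2] \equiv 0 \pmod m$ and 
if $B = \theta_1U$ and $A = \bar T\theta_2$ then $2d \equiv R[\bar T iT\bar T\theta_2\bar U\bar\theta_1] \equiv 0 \pmod m$; hence all 
four-rowed determinants of (M) with columns 1 and 2 are divisible by $m$. 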

If in (4) we replace $\bar T$ by $\bar T j$, (5) remains unchanged except for sign; 
but this substitution replaces columns 1 and 2 of (M) by column 3 and the 
negative of column 4 of (M). Hence all four-rowed determinants of (M) 
with columns 3 and 4 are divisible by $m$. 

In (4) replace $T$ by $U$ and change the signs of the elements in the last 
three rows. This gives a four-rowed determinant whose first two columns are 
columns 5, 6 of (M), and (5) becomes 


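(6) $2d \equiv R[\bar U iU \cdot \bar B A] \pmod m$ or $2d \equiv -R[\bar U iU \cdot \bar A B] \pmod m$. 

Put $B = W$ and $A = \theta U$ or $\bar T\theta$; then (6) gives $2d \equiv -R[\bar U iU\bar U\bar\theta W] \equiv 0$ 
$\pmod m$ or $2d \equiv R[\bar U iU\bar W\bar T\theta] = R[\bar U iU\bar U\bar Z\theta] \equiv 0 \pmod m$. If $A = \bar T\theta_1$, 
$B = \bar T\theta_2$, then $2d \equiv R[\bar U iU\bar\theta_2T\bar T\theta_1] \equiv 0 \pmod m$, and if $A = \bar T\theta_1$ and 
$B = \theta_2U$, then $2d \equiv R[\bar U iU\bar U\bar\theta_2\bar T\theta_1] \equiv 0 \pmod m$. Hence all four-rowed 
determinants of (M) containing columns 5 and 6 are divisible by $m$; similarly 
for the determinants with columns 7 and 8, as is seen by replacing $U$ by $-jU$. 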

If in (4) we replace the first two columns by columns 1 and 3 of (M), 
then $d$ satisfies $2d \equiv -R[\bar T jT \cdot B\bar A] \pmod m$ or $2d \equiv R[\bar T jT \cdot A\bar B]$ 
$\pmod m$; and if in (4) we replace the first two columns by columns 1 and 
4 of (M), then $d$ satisfies 
$2d \equiv -R[\bar T kT \cdot B\bar A] \pmod m$ or $2d \equiv R[\bar T kT \cdot A\bar B] \pmod m$. 


We now proceed with each of these two sets of formulas for $2d$ as 
we did with (5). If in these formulas we replace successively $T$ by $-iT$, 
$U$, $-iU$, we obtain (except possibly for sign) from the first set of formulas 
the values of $2d \pmod m$ when the first two columns of (4) are respectively 
replaced by columns 2 and 4 of (M), by columns 5 and 7 of (M), by columns 
6 and 8 of (M), and from the second set we get the values of $2d \pmod m$ 
when these two columns of (4) are respectively replaced by columns 2 and 3 
of (M), by columns 5, 8 of (M), by columns 6, 7 of (M). Now in each of 
these formulas make the same substitutions for $A$ and $B$ as we made in (5) 
and (6). In all cases we get $d \equiv 0 \pmod m$, and this completes the proof 
that all four-rowed determinants are divisible by $m$, and hence the theorem. 

A result due to Olson [4] giving all integral quaternions doubly divisible 
by a given integral quaternion $T$ can be deduced from Theorem 1. 


COROLLARY 1. All integer solutions of $TW = ZT$, $T$ odd and primitive, 
are given by 
$Z = N\bar T + 2a$, $W = \bar T N + 2a$, 
where $a$ and $N$ are arbitrary. 

For $Z$, $W$ satisfy $Z = X\bar T + T\bar Y$, $W = \bar T X + \bar Y T$. Replace $X$ by $Y + N$ 
and replace the equal rational integers $T\bar Y + Y\bar T$, $\bar T Y + \bar Y T$ by $2a$. 

In the case $T$ even and primitive we can write [3, p. 304] $T = S\pi$ where 
$S$ is odd; also there is an integer $V$ such that $\pi W = V\pi$, where $W$ is obtained 
from $V$ by known sign changes and interchanges of coordinates of $V$, depending 
on the value of $\pi$ selected. Since $TW = ZT$, it follows that $SV = ZS$, and 
so $Z = N\bar S + 2a$, $V = \bar S N + 2a$. 


3. THEOREM 2. All odd integral solutions of the equation 
(7) $N(X) = uv$ 
are given by 
(8) $X = mAB$, $u = mA\bar A$, $v = m\bar B B$, 
where $m$, $A$, $B$ are odd. 

Let $X = pY$, $Y$ primitive; then $p^2Y\bar Y = uv$ and we can write $p^2 = ab$, 
$Y\bar Y = cd$, $u = ad$, $v = cb$. Since $Y$ is primitive, $c$ odd and $c \mid N(Y)$, then 
[5, Theorem 1, p. 488] $Y$ has a right divisor of norm $c$, $Y = DC$, $c = \bar C C$ 
and hence $D\bar D = d$. Also from $p^2 = ab$ we can write $p = mnq$, $a = mn^2$, 
$b = mq^2$. Hence 
$X = mnqDC$, $u = mn^2D\bar D$, $v = mq^2\bar C C$, 
and writing $B$, $A$ respectively for the products $qC$, $nD$ gives the required 
result. 

If m, A, B are arbitrary then (8) gives all the solutions of (7); for 
put Y=-7Z, Z odd, then = wv, where u=2u, when N(x) =2, and 
ZZ OY Uv, Where u = 2u,, v = 2v;, u= when is (1 +7)(1+ 7) 
or (1+i)(1++%). Since Z is odd, from ZZ = uv or u,v by (8) we get 
Y u=2mHE or 4mEE respectively, v= mBB. Replacing by 
A gives the required form (8). Finally from 7Z = u,v, we get Y = rmFEF, 

= 2mEE, v—2mFF, which takes the form (8) if we replace 7H by 
A(1+j) and (1+ j)F by B when (1+ 7%)(1+/)), and cE by A(1 + 
and (1+ k)F by B when r= (1+1)(1+ 4). 




4. We can now reduce the resolution of $XY = ZW$ to that of system 
(1) by using the conditions for quaternions to have common divisors of odd 
norm as given by G. Pall [5]. 

THEOREM 3. All odd integer solutions of 
(9) $XY = ZW$ 
are given by 
(10) $X = cS\bar S\,FPABC$, $Z = eC\bar C\,F(U\bar N + PV)RST$, 
$Y = eT\bar T\,\bar C\bar B\bar A(\bar P U + VN)E$, $W = cB\bar B\,\bar T\bar S\bar R NE$, 
where $A\bar A = R\bar R$, $P\bar P = N\bar N$ and all parameters are odd except that $U$ and $V$ 
must be of opposite parity. 
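The formulas (10) can be tested by direct multiplication. The following Python sketch is an illustration only: it assumes the placement of conjugation bars $X = cS\bar S\,FPABC$, $Z = eC\bar C\,F(U\bar N + PV)RST$, $Y = eT\bar T\,\bar C\bar B\bar A(\bar P U + VN)E$, $W = cB\bar B\,\bar T\bar S\bar R NE$, stores quaternions as integer 4-tuples, and uses hypothetical function names. It verifies $XY = ZW$ for one choice of parameters with $N(A) = N(R)$ and $N(P) = N(N)$; it does not test completeness or the parity conditions.

```python
from functools import reduce


def qmul(a, b):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)


def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])


def qadd(a, b):
    return tuple(x + y for x, y in zip(a, b))


def norm(a):
    return sum(x * x for x in a)


def scale(c, a):
    return tuple(c * x for x in a)


def qprod(*factors):
    """Left-to-right product of several quaternions."""
    return reduce(qmul, factors)


def formula_10(c, e, A, B, C, R, S, T, P, N, U, V, E, F):
    """Build (X, Y, Z, W) from the parameters of (10) (assumed bar placement)."""
    X = scale(c * norm(S), qprod(F, P, A, B, C))
    Z = scale(e * norm(C), qprod(F, qadd(qmul(U, qconj(N)), qmul(P, V)), R, S, T))
    Y = scale(e * norm(T), qprod(qconj(C), qconj(B), qconj(A),
                                 qadd(qmul(qconj(P), U), qmul(V, N)), E))
    W = scale(c * norm(B), qprod(qconj(T), qconj(S), qconj(R), N, E))
    return X, Y, Z, W


# Example parameters: N(A) = N(R) = 5 and N(P) = N(N) = 3.
X, Y, Z, W = formula_10(
    c=3, e=5,
    A=(1, 2, 0, 0), B=(1, 2, 2, 0), C=(1, 0, 1, 1),
    R=(2, 0, 0, 1), S=(2, 1, 0, 0), T=(0, 1, 2, 0),
    P=(1, 1, 1, 0), N=(0, 1, 1, 1),
    U=(1, 1, 0, 0), V=(1, 0, 0, 0),
    E=(1, 0, 2, 0), F=(2, 1, 0, 0),
)
```

Both products reduce to $ce\,N(A)N(B)N(C)N(S)N(T)\,F(N(P)U + PVN)E$, which is why the two norm conditions are exactly what the identity needs.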
Put 
(11) $X = xX_1$, $Y = yY_1$, $Z = zZ_1$, $W = wW_1$, 
where $X_1$, $Y_1$, $Z_1$, $W_1$ are each primitive. Let $k$ and $l$, both necessarily odd, 
denote the g.c.d. of the coordinates of the product $X_1Y_1$ and the product 
$Z_1W_1$ respectively. Then [5, Th. 5, p. 491] we can write $X_1 = X_2K$, 
$Y_1 = \bar K Y_2$, $N(K) = k$, where the product $X_2Y_2$ is primitive; similarly 
$Z_1 = Z_2L$, $W_1 = \bar L W_2$, $N(L) = l$, where the product $Z_2W_2$ is primitive. Thus 
(9) becomes $xyK\bar K\,X_2Y_2 = zwL\bar L\,Z_2W_2$. Hence 

(12) $xyK\bar K = zwL\bar L$ and $X_2Y_2 = Z_2W_2$. 

The solutions of the first equation in (12) satisfy 
$x = abc$, $y = def$, $z = aen$, $w = dhc$; $K\bar K = ghn$, $L\bar L = gbf$, 
and by a repeated application of Theorem 2, since $K$ is primitive, all the 
solutions of $K\bar K = ghn$ are given by $K = ABC$, $g = A\bar A$, $h = B\bar B$, $n = C\bar C$; 
similarly all the solutions of $L\bar L = gbf$ are given by $L = RST$, $g = R\bar R$, 
$b = S\bar S$, $f = T\bar T$. Equating the two values of $g$ shows that $A\bar A = R\bar R$. Hence 
all the odd integers satisfying (12) with $K$ and $L$ primitive are contained in 

(13) $x = caS\bar S$, $y = deT\bar T$, $z = aeC\bar C$, $w = dcB\bar B$; $K = ABC$, $L = RST$, 
where $A\bar A = R\bar R$ and all parameters are odd. 

Now consider the second equation in (12). Let $m$, necessarily odd, 
divide both $N(Y_2)$ and $N(W_2)$. Then since $X_2Y_2$ is primitive it follows 
[5, p. 489, Corollary 1'] that any right divisor of norm $m$ of $X_2Y_2$ is also a 
right divisor of norm $m$ of $Y_2$, and conversely; that is, $Z_2W_2$ and $Y_2$ have the 
same right divisors of norm $m$. Since $Z_2W_2$ is primitive, $Z_2W_2$ and $W_2$ also 
have the same right divisors of norm $m$; hence any right divisor of norm $m$ 
of $Y_2$ is also a right divisor of norm $m$ of $W_2$, and conversely. 

Therefore, if $(N(Y_2), N(W_2)) = m$, we necessarily have $Y_2 = ME$, 
$W_2 = NE$, $(N(M), N(N)) = 1$, $N(E) = m$, and the second equation in (12) 
reduces to $X_2M = Z_2N$. Similarly, if $(N(X_2), N(Z_2)) = p$, it is necessary that 
$X_2 = FP$, $Z_2 = FQ$, where $(N(P), N(Q)) = 1$, and therefore $PM = QN$, 
$(N(P), N(Q)) = 1$, $(N(M), N(N)) = 1$, hence $N(P) = N(N)$. Thus the 
resolution of the second equation in (12) is reduced to that of 

(14) $PM = QN$, $N(P) = N(N)$. 

Since $P$ is necessarily odd and primitive, it follows from (14) by Theorem 1 
that $M = \bar P U + VN$, $Q = U\bar N + PV$. Hence all the odd solutions of 
$X_2Y_2 = Z_2W_2$, both sides primitive, are contained in the set 

(15) $X_2 = FP$, $Y_2 = (\bar P U + VN)E$, $Z_2 = F(U\bar N + PV)$, $W_2 = NE$, 
where $P\bar P = N\bar N$, and all parameters are odd except that $U$ and $V$ are of 
opposite parity. Substituting (15) and (13) into (11) and absorbing $a$ into 
$F$ and $d$ into $E$ yields (10). 

5. The solution (10) of (9), though complete, is not stated explicitly, 
since the parameters $A$, $R$, $P$, $N$ must satisfy the systems $A\bar A = R\bar R$, $R$ odd 
and $P\bar P = N\bar N$, $P$ odd. Each of these systems, which are of the form 
$X\bar X = Y\bar Y$, $X$ odd, can be solved explicitly in terms of quaternions by an 
application of Theorem 1. In fact we establish 

THEOREM 4. All solutions of $X\bar X = Y\bar Y$ are given by 
(16) $fX = MU + UN$, $fY = \bar M U - UN$, 
where $N$ is pure and it suffices to take $U = V\pi$, where $V$ is odd and primitive 
and $f$ is 2 or 4. 
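The equality of norms behind (16) can be checked numerically. The sketch below is an illustration under an explicit assumption: the second formula is read with the conjugate of $M$, that is $fX = MU + UN$, $fY = \bar M U - UN$ with $N$ pure. Quaternions are stored as integer 4-tuples and the function names are hypothetical. Only $N(fX) = N(fY)$ is verified; divisibility by $f$ and completeness are not.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)


def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])


def qadd(a, b):
    return tuple(x + y for x, y in zip(a, b))


def qsub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def norm(a):
    return sum(x * x for x in a)


def theorem4_pair(M, U, N):
    """Return (fX, fY) = (MU + UN, conj(M)U - UN); N must be pure."""
    if N[0] != 0:
        raise ValueError("N must be a pure quaternion (zero rational part)")
    UN = qmul(U, N)
    return qadd(qmul(M, U), UN), qsub(qmul(qconj(M), U), UN)


# Smallest example: M = 1 + i, U = 1, N = i gives 1 + 2i and 1 - 2i.
fX, fY = theorem4_pair((1, 1, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0))
```

In the small example the unconjugated expression $MU - UN$ would equal 1, of norm 1 rather than 5, which illustrates why the conjugate of $M$ is assumed in the second formula.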


Put $P = X - Y$, $Q = X + Y$ and the equation becomes $P\bar Q = -Q\bar P$. Since 
$Q$ is even we can write $Q = eV\pi$, where $V$ is odd and primitive. If also 
$N(\pi) = 2$ we can put $P = W\pi$. Hence in this case $W\bar V = -V\bar W$, and so, 
by Theorem 1, $W = KV + VL$ and $\bar W = -(\bar V K + L\bar V)$. It follows that 
$KV + VL = -(\bar K V + V\bar L)$, $k_1 = -l_1$, and it therefore suffices to take $K$ 
and $L$ pure. Hence $X = \tfrac12(MV\pi + VL\pi)$, $Y = \tfrac12(\bar M V\pi - VL\pi)$, where we 
have replaced $K + e$ by $M$ or, since there exists a pure quaternion $N$ satisfying 
$L\pi = \pi N$, replacing $V\pi$ by $U$ we can write the solutions as $2X = MU + UN$, 
$2Y = \bar M U - UN$, where $N$ is pure. 

In the case $Q = eV\pi$, where $N(\pi) = 4$, $P$ will have a divisor of norm 2, 
and proceeding as above we get $4X = MU + UN$, $4Y = \bar M U - UN$, where 
$N$ is pure and we have replaced $K + 2e$, $V\pi$ by $M$, $U$ respectively. 

The solutions (16) are equivalent to those given by E. T. Bell [1, p. 95] 
in terms of rational integral parameters. For, if we put 


U =a, + ta, + jas + kas, 
2M = 2a — i( bss biz) + — bos) — k (bes bis); 
2N = — — j(d25 + Dis) + k (b23 — bis), 


the solutions (16) take the form 


4-1 
(17) = — + aa; + isi, 
j=1 
(18) fyi = + aai — sj, 
j=1 g=1 


where it suffices to take = bs4(mod 2), = boy(mod 2), = b25(mod 2). 
We now show that it suffices to take f — 2, but we must then give up the 
congruence conditions on 0;;. When f = 4 we have a even and also U = Vz, 
where N(x) =4 and V odd and primitive. Hence all the a; are odd and 
it follows that b,.. 0,,. 6:, are all even or two of them are odd. In the 
former case replace a by 2a, 6;; by 2b;; in (17) and (18) and divide by 2: 
the result is (17) and (18) with f= 2. In the second case let, for example. 
by», hence be odd and let b,,. hence be even: then make the 
replacements a = 2a, = + B13 = 2013 — ay, = + 
Ds4 = 2034 — A102, = 2044. Dos = 2D05. If we divide by 2, we obtain (17), 
(18) with f= 2. 

For the complete solution of $X\bar X = Y\bar Y$, $X$ odd, it remains to specify 
the range of the parameters $a_i$, $b_{ij}$ having the property that the expressions 
on the right hand side of (17) (hence, of (18)) become even and their 
sum $\equiv 2 \pmod 4$. The necessary and sufficient congruence relations for these 
conditions to hold can be stated as follows: 


Let p, q, 7, s be some permutation of 1, 2, 3, 4, and let by; = — dy. 
(i) The a; are not all odd; 


(ii) if ap even, and dg, as odd then bps is even, 




Dor = brs = beg (Mod 2) and if a,=a,= + a, (mod 4) then the remaining 
parameter is given by dg =2 + Dpq + 2Dpr Dps (mod 4) ; 

(iii) if dp, dg even, a,, a, odd, a,==+ a, (mod 4), then bp, + 
Dor + Ups, are even and if a,+a,=0 (mod4) then 2a=2-+ dy, 
-++ Dor + Dps (mod 4), but if a, + a,=2 (mod 4) then 2b), = 2 + Dp, 
Der by, by, (mod 4) ; 

(iv) if a, odd and ay, a,, as even then Dyg, dpr, bps are even and if 
Ug = a, =a, (mod 4) then a==2 + Dpqg + Dp, + bps (mod 4), but if ag=a, 
a, (mod 4) then + Dpr + + 2 (Bas + Dre) (mod 4). 


To obtain (i), note that if a; == + 1 (mod 4), then 
2(> = +(e — bi» — by; — bis) +a— bos — bos) 
+ + bes + a — (O14 + Boy + + d) (mod 4). 


and also that the expressions in each bracket are even, being respectively 
congruent to 2x, (mod 2); hence it suffices to take only the 
positive signs. Therefore > 2;== 2a(mod 2), and so all the a; cannot be odd. 

For (ii), (iii) and (iv) it suffices to obtain the condition for a definite 
selection of $p$, $q$, $r$, $s$, since (17) and (18) remain unaltered when any two 
subscripts are interchanged. It suffices to take all the $a_i$ not even, since it 
suffices to have $U = V\pi$, $V$ odd and primitive. 


6. As an application of Theorem 2 we can obtain the complete integral 
solution of the multiplicative equation $xX\bar X = yY\bar Y$ in integral formulas. 
Put $X = wW$, $Y = zZ$ where $W$ and $Z$ are primitive; then this equation 
becomes $xw^2(W\bar W) = yz^2(Z\bar Z)$ and hence we can write $x = ars^2c$, $w = glmn$, 
$y = aeln^2$, $z = grst$, $W\bar W = dert^2$, $Z\bar Z = dlm^2c$. From the last two equations, 
by Theorem 2, $W = AGEF$, $e = E\bar E$, $r = F\bar F$, $t^2 = G\bar G$, and $Z = CKHJ$, 
$l = H\bar H$, $c = J\bar J$, $m^2 = K\bar K$ where $d = A\bar A = C\bar C$. Hence the complete solu- 
tion can be written as 
(19) $X = AGL\bar H HF$, $Y = CKHF\bar F M$, $x = a\bar F M\bar M F$, $y = aL\bar H H\bar L$, 
where we have replaced the products $nE$, $sJ$, $gA$, $gC$, $mG$, $tK$ by $L$, $M$, $A$, 
$C$, $G$, $K$ respectively. If now we put $L\bar H = P$, $HF = Q$, $\bar F M = R$, $AG = S$, 
$CK = T$ then all the numbers (19) are included in 

(20) $X = SPQ$, $Y = TQR$, $x = aR\bar R$, $y = aP\bar P$, where $S\bar S = T\bar T$. 

By direct substitution we see that all the numbers (20) are solutions of the 
given equation. Hence (20) is the complete solution of $xX\bar X = yY\bar Y$. 
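The direct substitution mentioned for (20) can be sketched in code. The following Python fragment is an illustration only: it assumes the reading $X = SPQ$, $Y = TQR$, $x = aR\bar R$, $y = aP\bar P$ with $S\bar S = T\bar T$, stores quaternions as integer 4-tuples, and uses hypothetical function names. It verifies $x\,N(X) = y\,N(Y)$ for one choice of parameters.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)


def norm(a):
    return sum(x * x for x in a)


def formula_20(a, S, T, P, Q, R):
    """Return (X, Y, x, y) from the parameters of (20); requires N(S) = N(T)."""
    if norm(S) != norm(T):
        raise ValueError("parameters must satisfy N(S) = N(T)")
    X = qmul(qmul(S, P), Q)
    Y = qmul(qmul(T, Q), R)
    return X, Y, a * norm(R), a * norm(P)


# Example parameters with N(S) = N(T) = 5.
X, Y, x, y = formula_20(2, (1, 2, 0, 0), (0, 1, 0, 2),
                        (1, 1, 1, 0), (3, 0, 1, 1), (1, 0, 2, 2))
```

Since the norm is multiplicative, both sides equal $a\,N(P)N(Q)N(R)N(S)$.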




We conclude with a discussion of the equation $X^2 = ZW$. The complete 
solution of this equation would be given by (10), where the parameters are 
subject to the condition obtained by equating the expressions for $X$ and $Y$. 
This leads to an equation which cannot be handled with facility, but some 
progress can be made if in addition we require that $X^2$ be primitive. For if 
$X^2 = ZW$, we can write $N(X) = klm$, $N(Z) = kl^2$, $N(W) = km^2$, where 
$(l, m) = 1$. Since $X^2$ is primitive, the left divisors of norm $kl$ of $X$ and $Z$ 
are the same, and the right divisors of norm $km$ of $X$ and $W$ are the same: 
hence we can put $X = AP$, $Z = AQ$, $X = RB$, $W = UB$, where $N(A) = kl$, 
$N(B) = km$, $N(P) = N(U) = m$, $N(R) = N(Q) = l$. Therefore $PR = QU$, 
and by Theorem 1, $R = \bar P S + TU$, $Q = S\bar U + PT$ where $P\bar P = U\bar U$. Also 
$AP = RB$, and $R$ must be a left divisor of norm $l$ of $A$; hence $A = RV$, 
$B = VP$. Hence all the solutions of $X^2 = ZW$, $X^2$ odd and primitive, are 
contained in the set 

(21) $X = (\bar P S + TU)VP$, $Z = (\bar P S + TU)V(S\bar U + PT)$, $W = UVP$, 

where $P\bar P = U\bar U$, and all parameters are odd except that $S$ and $T$ are of 
opposite parity. 
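That every triple in the set (21) satisfies $X^2 = ZW$ can be confirmed by multiplication. The Python sketch below is an illustration only: it assumes the bar placement $X = (\bar P S + TU)VP$, $Z = (\bar P S + TU)V(S\bar U + PT)$, $W = UVP$ with $N(P) = N(U)$, stores quaternions as integer 4-tuples, and uses hypothetical function names. Primitivity and parity of the parameters are not checked.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)


def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])


def qadd(a, b):
    return tuple(x + y for x, y in zip(a, b))


def norm(a):
    return sum(x * x for x in a)


def formula_21(P, U, S, T, V):
    """Return (X, Z, W) from the parameters of (21); requires N(P) = N(U)."""
    if norm(P) != norm(U):
        raise ValueError("parameters must satisfy N(P) = N(U)")
    R = qadd(qmul(qconj(P), S), qmul(T, U))   # R = conj(P) S + T U
    Q = qadd(qmul(S, qconj(U)), qmul(P, T))   # Q = S conj(U) + P T
    RV = qmul(R, V)
    return qmul(RV, P), qmul(RV, Q), qmul(qmul(U, V), P)


# Example parameters with N(P) = N(U) = 5.
X, Z, W = formula_21((1, 2, 0, 0), (2, 0, 1, 0),
                     (1, 0, 0, 1), (1, 1, 0, 0), (1, 0, 1, 1))
```

The identity rests on $PR = QU$, which follows from $P\bar P = U\bar U$ exactly as in the verification of Theorem 1.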

The set (21), however, is not the complete primitive solution since the 
precise range of the parameters necessary and sufficient to isolate these 
solutions is not available. 


THE INSTITUTE FOR ADVANCED STUDY. 


REFERENCES. 


1. E. T. Bell, “Separable diophantine equations,” Transactions of the American 
Mathematical Society, vol. 57 (1945), pp. 86-102. 

2. L. E. Dickson, “On the theory of numbers and generalized quaternions,” 
American Journal of Mathematics, vol. 46 (1924), pp. 1-16. 

3. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 
Oxford (1938). 

4. H. L. Olson, “Doubly divisible quaternions,” Annals of Mathematics, vol. 31 
(1930), pp. 371-374. 

5. Gordon Pall, “On the arithmetic of quaternions,” Transactions of the American 
Mathematical Society, vol. 47 (1940), pp. 487-500. 

6. Th. Skolem, “Diophantische Gleichungen,” Ergebnisse der Mathematik und 
ihrer Grenzgebiete, vol. 5 (1938), no. 4. 


DIVISOR ALGEBRAS.* ¹ 


By ISRAEL NATHAN HERSTEIN. 


Introduction. The object of this paper is to characterize the divisor 
algebras of algebraic function fields of degree of transcendence one over 
algebraically-closed constant fields by a “natural” system of axioms. 

To any algebraic function field of degree of transcendence one over a 
constant field there has been associated a set of objects called the divisors of 
the function field [1, chapters 2 & 3].² For these divisors multiplication, 
equivalence and dependence have been introduced so that under this multi- 
plication the divisors form a free Abelian group, and equivalence classes of 
divisors form projective geometries under the dependence. 

We will consider a free Abelian group—the group of divisors—equivalence 
and dependence of these divisors as primitive concepts. These notions will be 
related to each other by a set of simple axioms which, in the standard treat- 
ment of algebraic function fields of degree of transcendence one over alge- 
braically-closed constant fields, are early theorems. We will show that these 
axioms characterize the divisor algebras of such fields. 


1. Before beginning the study of our axiom system we should like to 
study briefly and state some of the results in a concrete case. We suppose 
that we are given an algebraic function field, $K$, in one variable, over an 
algebraically-closed constant field $k$. 

A place $P$ is a mapping of $K$ into $(k, \infty)$ such that: 

1) $P(f) = 0$ if and only if $P(1/f) = \infty$. 

2) If $P(f) \ne \infty$ and $P(g) \ne \infty$ then $P(f + g) = P(f) + P(g)$ and 
$P(fg) = P(f)P(g)$. 

3) For every $a$ in $k$, $P(a) = a$. 

We will write $P(f)$ as $f(P)$. Let $G_K$ be the free Abelian group generated 


* Received July 6, 1948. 
¹ The material of this paper comes from a thesis, written under the direction of 
Professor Max A. Zorn, and submitted to the Graduate School of Indiana University for 
the degree of Doctor of Philosophy. The author wishes to take this opportunity to 
thank Professor Zorn for his guidance and encouragement. 
² The numbers in square brackets refer to bibliography. 




by the places. To every place $P$ we can associate a valuation (logarithmic) 
$V_P$ of $K$. We say a place $P$ is a zero of $f$ if $V_P(f) > 0$ and a pole of $f$ if 
$V_P(f) < 0$. For only a finite number of places $P$ is $V_P(f) \ne 0$. The 
divisor of a function $f$ is defined to be $\prod_P P^{V_P(f)}$, and we say that $f \to \prod_P P^{V_P(f)}$. 

For any divisor $D = \prod_{i=1}^{n} P_i^{m_i}$ we define the order of $D$ to be $\sum_{i=1}^{n} m_i$; if each 
$m_i \ge 0$, then $D$ is said to be integral. We say that $A \mid B$ (read as $A$ divides 
$B$) if $B = AC$ where $C$ is an integral divisor. Two divisors $A$ and $B$ are 
defined to be equivalent if there exists an $f \in K$ with $f \to A/B$. It follows that 
this equivalence is reflexive, symmetric and transitive. We define § to be the 
set of divisors equivalent to $I$, the unit element of $G_K$. § is called the 
principal subgroup and its elements are called principal divisors. The 
equivalent divisors $A_1, \dots, A_n$ are defined to be dependent if and only if 
the functions $f_i \to A_i/A_n$ are linearly dependent over $k$. By direct checking 
it can be ascertained that the divisors so constructed have the following 
properties: 

2. If A and B are equivalent, then they are of the same order. 

3. The equivalence classes of divisors are projective geometries (in the 
sense of [2 pp. 208-209]) under the dependence. 


4. If $A_1, \dots, A_n$ are dependent, then for all divisors $X$, $A_1X, \dots, A_nX$ 
are dependent. 

5. A divisor dependent on integral divisors is itself integral. 

6. If $A$ and $B$ are independent integral divisors and if $P$ is any place, 
then there exists a divisor $D_P$, dependent on $A$ and $B$, such that $P \mid D_P$. 

7. Pappus’ theorem is true in every plane of our geometries. 

8. The transform of a projectivity by a divisor is again a projectivity. 


If $A$, $B$, $C$ are three independent points, we shall denote the plane 
generated by them as $[A, B, C]$, and the line generated by any two of them, 
say $A$ and $B$, by $[A, B]$. By a marked line we shall mean a line from which 
three distinct points have been selected. If the three points selected are 
$A$, $B$, $C$ then the marked line will be denoted by $\langle A, B, C\rangle$. 

Following Hodge and Pedoe [2, chapter VI], on a marked line $\langle A, B, C\rangle$ 
we construct a field $\Phi_{ABC}$ plus an infinity. In lieu of property 7. above, these 
fields are all commutative. The elements of $\Phi_{ABC}$ will be the quadruples 
$(A, B, C; R)$ where $R \in \langle A, B, C\rangle$; and where $(A, B, C; A)$ is the zero 
element of $\Phi_{ABC}$, $(A, B, C; B)$ is the unit element of $\Phi_{ABC}$ and $(A, B, C; C)$ 
is the infinity. 

Let fi, f2.f;e be linearly independent over and suppose that > A, 
and D. Thus A,C, are independent divisors and so generate 
a plane [.14,C,)]. Suppose further that f, + f.— B. Consider the marked 
line <A, B.C> in the plane [A,C,D]. To every point X in <A, B,C) we 


can assign coordinates in the following two ways: 
1) (A, B,C;X) © 


2) If Af; + then the coordinates of .V are (»,0,A) where A, 
are in k. Thus, since for every non-zero pe k 
p(Af; + pfs) X, p(p. 0, A) = (p, 
We define oinc as follows: for A 0, oinc(p/A) = (A, B,C; X). From 
Hodge-Pedoe [2, p. 270, paragraph 3] o1nc is an isomorphism of & onto ®42G0. 


Thus we have 


THEOREM 1.1. If $K$ is an algebraic function field over an algebraically 
closed field of constants $k$, and if $\Phi_{ABC}$ is the field associated with the marked 
line $\langle A, B, C\rangle$ in a class of divisors of $G_K$, then $\sigma_{ABC}(k) = \Phi_{ABC}$ where $\sigma_{ABC}$ 
is an isomorphism which can be given explicitly. 
Definition. A/C and f—1—-B/C then fo <A, 


A natural question would be: given two functions in $K$ and the marked 
lines associated with them, what marked lines are associated with the sum 
and the product of these functions? The answer to these two questions is 
given by the following two theorems. 


TuHeEorEM 1.2. If f, ge K are such that f, 9, 1 are linearly independent 
over k, and if fo <4, B,C> and g@<D,E,F> then f+ CP) 
where R and S are determined in the following manner (Diagram 1). 


Proof. Since f, g, 1 are linearly independent over *, CF, CD and AF 
are independent divisors. Since Se [CH,AF]{ (CD, BF], S/CFe|E/F, 
[D/F, B/C]. Now f>4A/C, f—1—-B/C, g>D/F and g—1 
— E/F ; thus if h then since e |A/C, F/F],h =Xg —1) + ef 
where A, » are in k. But since S/Cé is in [D/F, B/C], h= ag + B(f —1) 
where a, Bek. Since f,g,1 are linearly independent over k, a = B =A—p, 




and hence f + g--1—8S/CF. Using the same argument for R/CF we obtain 
that R/CF. Combining the two we have that 

If f, g, 1 are dependent linearly over /&, and fA—g-+uyp, then if 
g =Af -+ om it follows from Theorem 1.1 that o1rc(—p/A) = (A, B,C; R) 


DIAGRAM 1. 


and that oanc(— + 1) = (A, B,C; 8) implies that go <R,S,C>. We 
then find that f+ N,C> where ossc(—p/(A + 1) = (A, BL 
and +1) +1) = (A, B,C;N). 

Using the same method of proof as in Theorem 1.2 we readily obtain 


DIAGRAM 2. 


THEOREM 1.3. Jf f.geK and fg¢k and if fe <A,B,Cy and 
<D,E,F> then fg<<AD,M,CF> where M is determined in the 
following manner (Diagram 2). 


We shall return to these considerations again at the end of this paper. 




2. The primitive concepts and axioms. In this section the primitive 
concepts are to be introduced and an axiom system set up to relate these 
fundamental concepts and the objects defined from them. 


I. The first primitive concept to be considered is that of prime divisor. 
We assume that we are given a set of objects which we will call prime divisors 
and will denote by $P$, $P_1$, $Q$, $Q_1$ etc. The set of prime divisors will be called 
the Riemann surface, $\mathfrak{R}$. 

Let $G$ be the free Abelian group generated by the prime divisors. We 
identify $P$ with $P^1$. Thus $\mathfrak{R} \subset G$. By a divisor $D$ we mean an element of $G$. 
Thus the unit divisor $I$ of $G$ is $\prod_{P \in \mathfrak{R}} P^0$. 

Definition. A divisor $D = \prod_{i=1}^{n} P_i^{a_i}$ is said to be an integral divisor if 
for each $i = 1, 2, \dots, n$, $a_i \ge 0$. 

Since $G$ is Abelian there will be no confusion in writing $AB^{-1}$ as $A/B$. 
It is clear that every divisor $D$ can be written as $A/B$ where $A$ and $B$ are 
integral divisors. 


Definition. If $A$ and $B$ are divisors then $A \mid B$ (read as $A$ divides $B$) 
if and only if $B = AC$ where $C$ is an integral divisor. 

Two integral divisors $A$ and $B$ will be said to be relatively prime if for 
all prime divisors $P$ the statement $P \mid A$ and $P \mid B$ is false. 

Definition. If $D = \prod_{i=1}^{n} P_i^{a_i}$ then the order of $D$, denoted by $n(D)$, is $\sum_{i=1}^{n} a_i$. 

Thus $n(I) = 0$ where $I$ is the unit divisor. 

From the definitions of the order of a divisor and the product of two 
divisors we immediately have 

LEMMA 1. For all $A, B \in G$, $n(AB) = n(A) + n(B)$ and $n(A/B)$ 
$= n(A) - n(B)$. 

It is clear that if $n(D) < 0$ then $D$ is not integral, and if $n(D) = 0$ 
then $D$ is integral if and only if $D = I$. 
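The group of divisors, the order function and divisibility are simple enough to model directly. The following Python sketch is an illustration only: a divisor is assumed to be stored as a dictionary from prime-divisor labels to nonzero integer exponents, and all function names are hypothetical. It reproduces Lemma 1 and the remark that follows it on small examples.

```python
def d_mul(A, B):
    """Product of two divisors: add exponents, drop primes with exponent 0."""
    out = dict(A)
    for p, e in B.items():
        out[p] = out.get(p, 0) + e
    return {p: e for p, e in out.items() if e != 0}


def d_inv(A):
    """Inverse divisor: negate every exponent."""
    return {p: -e for p, e in A.items()}


def d_div(A, B):
    """The quotient A/B = A B^(-1)."""
    return d_mul(A, d_inv(B))


def order(A):
    """n(A): the sum of the exponents."""
    return sum(A.values())


def is_integral(A):
    return all(e >= 0 for e in A.values())


def divides(A, B):
    """A | B when B = AC with C integral, i.e. when B/A is integral."""
    return is_integral(d_div(B, A))


I = {}                      # unit divisor: every exponent is 0
A = {"P": 2, "Q": -1}       # the divisor P^2 Q^(-1)
B = {"Q": 3}                # the divisor Q^3
```

With this representation the unit divisor is the empty dictionary, so an integral divisor of order 0 is necessarily the unit divisor.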


II. The next primitive concept that we introduce is that of the principal 
subgroup § of $G$. A divisor which is in § will be termed a principal divisor. 

Definition. If $A$ and $B$ are divisors then $A \sim B$ if and only if $A/B$ 
is in §. 

It is trivial that if $A \sim B$ then for all divisors $X$, $AX \sim BX$. We will 
denote the equivalence class of a divisor $A$ by $(A)$. The set of integral 
divisors of $(A)$ will be denoted by $\{A\}$. 


III. The third and last primitive concept that will concern us is that 
of dependence relations defined only for equivalent divisors. We suppose that, 
given $A_1, \dots, A_n$ as equivalent divisors we can say whether or not they are 
dependent. If the $A_i$ are not dependent they will be said to be independent. 
If the $A_i$ are not equivalent, neither dependence nor independence will be 
defined for them. If $A_1, \dots, A_n, B$ are dependent and if for some $s$, 
$0 < s \le n$, $A_1, \dots, A_s, B$ are dependent, while $A_1, \dots, A_s$ are independent, 
then $B$ is said to depend on $A_1, \dots, A_n$. 


Now that our primitive concepts have been laid out and some definitions 
derived from them, we introduce a system of axioms to intertwine these 
concepts and definitions. 


The axioms are as follows: 

2. If $A \sim B$ then $n(A) = n(B)$. 

3. Axioms of dependence: 

a) If $A$ depends on $B_1, \dots, B_{n-1}, B_n$ but not on $B_1, \dots, B_{n-1}$, then 
$B_n$ depends on $B_1, \dots, B_{n-1}, A$. 

b) If $A$ depends on $B_1, \dots, B_n$ and each $B_i$ depends on $C_1, \dots, C_m$, 
then $A$ depends on $C_1, \dots, C_m$. 

c) If $A_1, \dots, A_n$ are independent and $B_1, \dots, B_m$ are independent, 
but $A_1, \dots, A_n, B_1, \dots, B_m$ are dependent, then there exists a divisor $D$ 
which depends on $A_1, \dots, A_n$ and on $B_1, \dots, B_m$. 

d) $A$ and $B$ are dependent if and only if $A = B$. 

4. If $A_1, \dots, A_n$ are dependent, then for all divisors $X$, $A_1X, \dots, A_nX$ 
are dependent. 

5. If $B$ depends on $A_1, \dots, A_n$ which are all integral divisors, then $B$ 
is integral. 

6. If $A$ and $B$ are two independent integral divisors, and if $P$ is any 
prime divisor, then there exists a divisor $D_P$ dependent on $A$ and $B$ and such 
that $P \mid D_P$. 

The following axiom contains a concept which as yet is not defined. 
It will be applied only after the concept of multiprojectivity has been formally 
defined. 

7. The identity multiprojectivity is the only multiprojectivity leaving 
three distinct dependent points fixed.³ 


3. Linear families. In this section we will study some of the prop- 
erties of the dependence relation and some of the immediate consequences 
of the axioms in regard to these relations. 

If $A_1, \dots, A_n$ is a sequence of equivalent divisors, then by the linear 
family $[A_1, \dots, A_n]$ we mean the set of divisors $B$ which depend on 
$A_1, \dots, A_n$. If $s$ is the maximum number of independent $A_i$, $i = 1, \dots, n$, 
then $s$ will be called the dimension of $[A_1, \dots, A_n]$ and denoted by 
dim $[A_1, \dots, A_n]$. From the abstract theory of linear dependence [3, vol. 1, 
p. 95] the dimension of a linear family is invariant and any subfamily of a 
finite dimensional family is itself finite dimensional. By the multiples of a 
divisor $D$ in a linear family $F$ we mean the set of divisors $X$ in $F$ such 
that $D \mid X$. 


THEOREM 3.1. If $A_1, \dots, A_n$ are equivalent divisors and for each 
$i = 1, 2, \dots, n$, $D \mid A_i$, then $D \mid X$ for all $X$ in $[A_1, \dots, A_n]$. 

Proof. $X$ depends on $A_1, \dots, A_n$, hence by axiom 4 $X/D$ depends on 
$A_1/D, \dots, A_n/D$. But since $D \mid A_i$ for $i = 1, \dots, n$, the $A_i/D$ are integral 
and thus by axiom 5 $X/D$ is integral. Thus $D \mid X$. 

COROLLARY 1. The multiples of $D$ in $[A_1, \dots, A_n]$ form a linear 
family. 

COROLLARY 2. If $A$ and $B$ are relatively prime, independent integral 
divisors, then for each prime divisor $P$ there exists exactly one divisor $D_P$ 
in $[A, B]$ such that $P \mid D_P$. 

Proof. By axiom 6 there exists at least one such $D_P$. Suppose that 
there is another one $D'_P \ne D_P$. Then by axioms 3a. and 3b. $[A, B]$ 
$= [D_P, D'_P]$. Thus by Theorem 3.1 $P \mid A$ and $P \mid B$ which contradicts that 
$A$ is relatively prime to $B$. 


³ This axiom could be replaced by the following two axioms which are more easily 
verified in the case of a concrete function field: 

a. Pappus’ Theorem. 
b. The transform of a projectivity by a divisor is again a projectivity. 


THEOREM 3.2. If $A_1, \dots, A_n$ are equivalent integral divisors and 
dim $[A_1, \dots, A_n] = s$, then for every prime divisor $P$ the multiples of $P$ 
in $[A_1, \dots, A_n]$ form a linear family of dimension at least $s - 1$. 

Proof. If $P \mid A_i$ for $i = 1, \dots, n$ then the multiples of $P$ in $[A_1, \dots, A_n]$ 
by Theorem 3.1 are exactly $[A_1, \dots, A_n]$. Suppose then that $P \nmid A_1$. Let 
$X$ be any divisor in $[A_1, \dots, A_n]$ such that $X \ne A_1$ and $P \nmid X$. By axiom 6 
there exists an $X_P$ in $[X, A_1]$ such that $P \mid X_P$. Since $X_P$ depends on $X$ 
and $A_1$ and is different from both of them, $X$ depends on $A_1$ and $X_P$. Thus 
the multiples of $P$ in the family $[A_1, \dots, A_n]$ together with $A_1$ generate 
all of $[A_1, \dots, A_n]$; hence they form a family of dimension at least $s - 1$. 


THEOREM 3.3. For every divisor $A$, dim $\{A\}$ is finite. 

Proof. By axiom 5, $\{A\}$ is closed under dependence. 

By induction over $n(A)$ we have: 

1) If $n(A) < 0$ then $\{A\}$ is the null set and so the theorem is true. 

2) Suppose the theorem is true for all divisors of order less than $r$. 
Let $n(A) = r$. If $P$ is any prime divisor then $n(A/P) = r - 1$ and so by 
our induction hypothesis dim $\{A/P\} = s$, say, where $s$ is finite. Thus 
$\{A/P\} = [B_1, \dots, B_s]$. Suppose that a $C$ and a $D$ could be found in $\{A\}$ 
such that $B_1P, \dots, B_sP, C, D$ are independent. Then by Theorem 3.2 the 
multiples of $P$ in $[B_1P, \dots, B_sP, C, D]$ form a linear family of dimension 
at least $s + 1$. But this implies that dim $\{A/P\} = s \ge s + 1$, which is false. 
Thus dim $\{A\} \le s + 1$. This concludes the induction. 
THEOREM 3.4. If $A \mid B$ then $n(A) - \dim\{A\} \le n(B) - \dim\{B\}$. 

Proof. In the course of the proof of Theorem 3.3 we showed that for 
all prime divisors $P$, $\dim\{AP\} \le \dim\{A\} + 1$. Thus if $B = AC$ where $C$ 
is an integral divisor, then applying the result quoted $n(C)$ times we obtain 
the theorem. 
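Written out, the repeated application in this proof reads as follows. The factorization $C = P_1 P_2 \cdots P_r$ into (not necessarily distinct) prime divisors, with $r = n(C)$, is notation introduced here for illustration, and the bound $\dim\{AP\} \le \dim\{A\} + 1$ is used once for each factor.

```latex
\begin{align*}
B &= A\,P_1 P_2 \cdots P_r, \qquad r = n(C) = n(B) - n(A),\\
\dim\{B\} &\le \dim\{A P_1 \cdots P_{r-1}\} + 1 \le \cdots \le \dim\{A\} + r,\\
n(B) - \dim\{B\} &\ge n(B) - \dim\{A\} - r = n(A) - \dim\{A\}.
\end{align*}
```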


THEOREM 3.5. There exists an $A$ such that dim $\{A\} \ge 2$. 

Proof. By axiom 1, § $\ne I$. Hence there exists an $R \in$ §, $R \ne I$. Now 
$R = A/B$ where $A$ and $B$ are distinct integral divisors; since $R \in$ §, $A \sim B$. 
Thus dim $\{A\} \ge 2$ from axiom 3d. 


THEOREM 3.6. Jf A and B are independent integral divisors then for 
all positive integers n, A", +, AB"-", B" are independent divisors. 



ISRAEL NATHAN HERSTEIN. 


Proof. Without loss of generality A and B can be taken as relatively
prime. For if A = CX and B = DX where C and D are relatively prime,
then to check the independence of A^n, A^{n-1}B, ..., AB^{n-1}, B^n, that is of
C^nX^n, C^{n-1}DX^n, ..., CD^{n-1}X^n, D^nX^n, it is sufficient, by axiom 4, to check the
independence of C^n, C^{n-1}D, ..., CD^{n-1}, D^n.

By induction over n we have:
1) If n = 1 then by hypothesis A, B are independent.
2) Suppose the theorem is true for n = r ≥ 1.

Consider A^{r+1}, A^rB, ..., AB^r, B^{r+1}. By the induction hypothesis, A^r,
A^{r-1}B, ..., AB^{r-1}, B^r are independent, hence by axiom 4, A^{r+1}, A^rB, ..., AB^r
are independent. So if A^{r+1}, A^rB, ..., AB^r, B^{r+1} are dependent, then B^{r+1}
depends on A^{r+1}, A^rB, ..., AB^r. Therefore from Theorem 3.1, A | B^{r+1},
which is impossible since A and B are relatively prime and A ≠ I.


Definition. If A and B are independent divisors, then [A, B]^n = [A^n,
A^{n-1}B, ..., AB^{n-1}, B^n].

COROLLARY. dim [A, B]^n = n + 1.


THEOREM 3.7. For every integer n there exists a divisor A_n such that
dim {A_n} > n.

Proof. By Theorem 3.5 there exists an A such that dim {A} ≥ 2. Let
A and B be two independent divisors in {A}. Then [A, B]^n ⊂ {A^n}. But
dim [A, B]^n = n + 1, hence dim {A^n} ≥ n + 1. Let A_n = A^n.

4. Linear families as projective geometries. We are now going to
consider the integral divisors of a class (any finite dimensional linear family
will do) and show that by the proper definition of points, lines, etc. the
integral divisors of a class form a finite dimensional, Desarguian projective
space. As axioms for projective geometry we are using those of Hodge and
Pedoe [2, pp. 208-209]. It should be noted here that the dimension of a
linear family that we use here is always one larger than that of Hodge-Pedoe.


THEOREM 4.1. The integral divisors of a class form a finite dimensional,
Desarguian projective space.

Proof. We take the integral divisors of the class to be points of our
space, and for the Hodge-Pedoe linear spaces S_r we take the linear families
[A_1, ..., A_r] where the A_i are independent integral divisors of the class.




DIVISOR ALGEBRAS. 809 


By direct checking it is easily seen that these satisfy the Hodge-Pedoe axioms
for projective geometries. Theorem 3.3 gives us the finite dimensionality. 
There remains but to show the Desargues theorem. 

If dim {A} > 3 the space is Desarguian. If dim {A} = 3 then by
multiplying by an appropriate X we can, by Theorem 3.7, obtain a divisor
AX such that dim {AX} > 3. Since multiplication by X preserves dependence,
it preserves collinearity in the projective space; thus the Desargues
theorem for {AX} implies the Desargues theorem for {A}.

By a marked line we shall mean a linear family [X, Y] from which
three distinct points A, B, C have been selected. We shall denote the line
marked by A, B, C by <A, B, C>.

Following Hodge-Pedoe [2, chapter VI], on a marked line <A, B, C>
we construct a field Φ_ABC plus an infinity. The elements of Φ_ABC will be
quadruples (A, B, C; R) where R is in <A, B, C>; and where (A, B, C; A)
is the zero element of Φ_ABC, (A, B, C; B) is the unit element of Φ_ABC and
(A, B, C; C) is the infinity.
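
The normalization of a marked line (A goes to zero, B to the unit, C to infinity) can be illustrated numerically. The following is only a sketch under an assumption that is not in the paper: the line is modelled as the projective line over the rationals, and the coordinate is the classical cross-ratio rather than the synthetic Hodge-Pedoe construction. The names `bracket` and `coordinate` are hypothetical.

```python
from fractions import Fraction

def bracket(p, q):
    # 2x2 determinant of two points given in homogeneous coordinates
    return p[0] * q[1] - p[1] * q[0]

def coordinate(a, b, c, r):
    """Coordinate of r on the line marked by a, b, c (a -> 0, b -> 1, c -> infinity).

    Assumes a, b, c are pairwise distinct. Returns None for the point at infinity.
    """
    den = bracket(r, c) * bracket(b, a)
    if den == 0:
        return None  # r coincides with c, the "infinity" of the marked line
    return Fraction(bracket(r, a) * bracket(b, c), den)

A, B, C = (0, 1), (1, 1), (1, 0)
print(coordinate(A, B, C, A), coordinate(A, B, C, B), coordinate(A, B, C, C))
```

Any other choice of three distinct marked points gives a coordinate that differs from this one by a fractional linear change, which is the concrete counterpart of the isomorphisms between the fields on different marked lines.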

Given a sequence of divisors A_1, ..., A_n which are independent, we can use
the hyperplanes [A_1, ..., A_{i-1}, A_{i+1}, ..., A_n] (where n is the dimension
of the space) as coordinate planes, as is done in Hodge-Pedoe [2, p. 262].
Thus we introduce an analytic projective geometry where the equation of any
hyperplane is linear. Whenever the need arises we shall feel free to use
analytic projective geometry.

By a perspectivity σ of one line α onto another line β from a point R ∉ α,
R ∉ β, we shall mean the mapping defined for every X ∈ α by σ(X) = [X, R] ∩ β
where [X, R] ∩ β is the point of intersection of [X, R] and β. A projectivity
is the product of a finite number of perspectivities. We will denote
projectivities by τ.
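
A perspectivity is easy to compute once coordinates are available. The sketch below assumes a coordinatized projective plane with points and lines both given as integer homogeneous triples; this model, and the names `cross` and `perspectivity`, are illustrative assumptions and not part of the paper. It uses the standard fact that the line through two points, and the point common to two lines, are both given by a cross product.

```python
def cross(u, v):
    # cross product of two homogeneous triples
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def perspectivity(x, center, beta):
    """Image of the point x under the perspectivity from `center` onto the line `beta`.

    `beta` is given by its line coefficients; the result is [x, center] meet beta.
    """
    line_through_x_and_center = cross(x, center)
    return cross(line_through_x_and_center, beta)

# Map the line y = 0 onto the line x = 0 from the center (1, 1, 1).
image = perspectivity((2, 0, 1), (1, 1, 1), (1, 0, 0))
print(image)
```

The image is only defined up to a nonzero scalar factor, so two results should be compared as projective points, not as raw triples.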

σ_X is defined for all A by σ_X(A) = XA. Very often we will be considering
the resulting image of a line under a σ_X and in this case we will
consider σ_X as a line-to-line mapping and will not use a different notation
for it.

Definition. A multiprojectivity is a mapping which maps a line in a
space A onto a line in a second space B, which can be written as the product
of a finite number of τ's and σ_X's.

LEMMA 2. If π_1 and π_2 are multiprojectivities of <A, B, C> onto <D, E, F>
such that π_1(A) = π_2(A) = D, π_1(B) = π_2(B) = E and π_1(C) = π_2(C) = F
then π_1 = π_2.




Proof. π_2^{-1}π_1 is a multiprojectivity and it leaves A, B, C, respectively,
fixed. Hence from axiom 7 it is the identity mapping; hence the lemma.

As a special case of this lemma if τ is a projectivity taking three collinear
points into three other specified collinear points, then it is unique. This
implies Pappus' theorem in every plane of our geometries, which in turn
implies the commutativity of the fields Φ_ABC for any <A, B, C>.


For completeness we will now state a well-known projective theorem. 


THEOREM 4.2. If τ is a projectivity of <A, B, C> onto <R, S, T> which
takes A into R, B into S and C into T, then the mapping τ* defined by
τ*(A, B, C; D) = (R, S, T; τ(D)) is an isomorphism of Φ_ABC onto Φ_RST.


We are naturally concerned with the relationship between Φ_ABC and
Φ_{AX,BX,CX} for any given X. Using an argument which is of the same nature
as that used in proving Theorem 4.2 we have


THEOREM 4.3. If σ_X: <A, B, C> → <AX, BX, CX> such that σ_X(R) = XR
then the mapping σ*_X defined by σ*_X(A, B, C; R) = (AX, BX, CX; RX) is an
isomorphism of Φ_ABC onto Φ_{AX,BX,CX}.


Combining Theorems 4.2 and 4.3 we immediately have 


THEOREM 4.4. If π is a multiprojectivity of <A, B, C> onto
<R, S, T> which takes A into R, B into S and C into T, then the mapping
π* defined by π*(A, B, C; D) = (R, S, T; π(D)) is an
isomorphism of Φ_ABC onto Φ_RST.


Given two marked lines <A, B, C> and <R, S, T> then the mapping
σ(X) = XR/A brings the first line into the plane of the second. Then from
projective geometry we can find a projectivity that will take BR/A into S,
CR/A into T and R into itself. Thus we can find a multiprojectivity which
will take one marked line onto any other. Thus, from Theorem 4.4 we have
that the fields constructed on any two marked lines are isomorphic, and that
an isomorphism can be explicitly exhibited.


5. The function field. 


THEOREM 5.1. If A and B are independent integral divisors, then for
every prime divisor P there exists an integer n > 0 and a divisor D_P in
[A, B] such that P^n | D_P and if P^n | X in [A, B], then X = D_P.


Proof. Let A = A'S and B = B'S where S is an integral divisor and A'
and B' are relatively prime integral divisors. Thus [A, B] = [A'S, B'S]
= S[A', B']. Applying Corollary 2, Theorem 3.1 to [A', B'] we obtain a
unique D'_P in [A', B'], such that P | D'_P. Let D_P = D'_P S.

Thus to every line of integral divisors, [A, B], and to every prime
divisor P we can associate a unique D_P in [A, B]. This gives us a function
mapping the Riemann surface ℛ onto any line [A, B] of integral divisors.
As will be indicated, the mapping can easily be extended to any line [C, D].
We define our mapping precisely and study the nature of the functions so
obtained.


Definition. If A and B are independent integral divisors, then
F_AB: ℛ → [A, B] where for every P ∈ ℛ, F_AB(P) = D_P ∈ [A, B] and D_P is as
defined in Theorem 5.1.

If A and B are independent, but not integral, divisors, then we multiply
them by S so that AS and BS are integral. We define F_AB by F_AB(P)
= S^{-1}F_{AS,BS}(P). From Theorem 3.1 it follows that this definition is
independent of S.

If we are given two marked lines we have pointed out that the fields
constructed on them are isomorphic and that an isomorphism can be explicitly
given. We pick a particular marked line <X, Y, Z> and the field Φ_XYZ = Φ
and consider them as fixed henceforth. Given <A, B, C>, by the multiprojectivity
π_ABC we mean that multiprojectivity of <A, B, C> onto <X, Y, Z>
which takes A into X, B into Y and C into Z. To a marked line <A, B, C>
we now associate a function f_ABC as follows:

Definition. f_ABC: ℛ → Φ where for every P ∈ ℛ, f_ABC(P) = (X, Y, Z;
π_ABC F_AC(P)).

We define two such functions f_ABC and f_RST to be equal if for all P such
that f_ABC(P) ≠ (X, Y, Z; Z) ≠ f_RST(P), f_ABC(P) = f_RST(P).


THEOREM 5.2. For all divisors R, f_ABC = f_{AR,BR,CR}.


Proof. From the definition of F_{AR,CR} and Theorem 3.1 we have that for
all P, F_{AR,CR}(P) = R F_AC(P). Let σ_R(D) = RD for all divisors D. Thus
π_{AR,BR,CR}σ_R is a multiprojectivity and it takes A into X, B into Y and C
into Z. Thus by Lemma 2, it follows that π_ABC = π_{AR,BR,CR}σ_R. Thus for
every P,

f_{AR,BR,CR}(P) = (X, Y, Z; π_{AR,BR,CR}F_{AR,CR}(P))
= (X, Y, Z; π_{AR,BR,CR}σ_R F_AC(P))
= (X, Y, Z; π_ABC F_AC(P)) = f_ABC(P).


We note here that because of the isomorphism of Φ_ABC onto Φ by π*_ABC
it is sufficient to carry out any calculation in Φ_ABC and derive the result for Φ
using this isomorphism.

Now that a function has been assigned to every marked line, we attempt 
to define compositions for these functions so that under these compositions 
they form a field. We begin with the sum of two functions. 

If f_ABC(P) = (X, Y, Z; Z) we will say that f_ABC(P) = ∞.

Definition. If f_ABC and f_DEF are two functions such that CF, CD and
AF are independent divisors, then f_ABC + f_DEF = f_{R,S,CF} where R and S are
determined as in Diagram 1.

From its definition, the addition of functions is commutative. We now
proceed to show that f_{R,S,CF} is equal to the sum of the two functions in our
sense.

THEOREM 5.3. If f_ABC and f_DEF satisfy the conditions of the preceding
definition, then for all P such that f_ABC(P) ≠ ∞ and f_DEF(P) ≠ ∞,
(f_ABC + f_DEF)(P) = f_ABC(P) + f_DEF(P).


Proof. Consider Diagram 1. As was pointed out before, we can assign
coordinates, coming from the field Φ_{CD,R,AF}, to the plane and use analytic
projective geometry. Lower case Greek letters will indicate elements of
Φ_{CD,R,AF}. We assign coordinate axes as follows:

[CF, AF] as the x-axis, [CF, CD] as the y-axis and [CD, AF] as the z-axis.
We assign, as coordinates, to CF, (0, 0, 1); to CE, (1, 0, 1); to CD,
(1, 0, 0); to BF, (0, 1, 1); to AF, (0, 1, 0) and to R, (1, 1, 0). We map
<CF, CE, CD> onto <CD, R, AF> by the perspectivity σ_2 defined by σ_2(λ, 0, μ)
= (λ, μ, 0). Evidently CD goes into CD, CE into R and CF into AF. We
map <AF, BF, CF> onto <CD, R, AF> by the projectivity σ_1, defined by
σ_1(0, α, β) = (α, β, 0). Evidently σ_1 takes AF into CD, BF into R and CF
into AF. Since the equation of [CF, R] is x = y, the coordinates of a point
on it are of the form (γ, γ, δ). We map <CF, S, R> onto <CD, R, AF> by the
projectivity σ_3 defined by σ_3(γ, γ, δ) = (γ, δ, 0). Evidently CF is taken into
AF, S into R and R into CD by σ_3.

If (λ, 0, μ), (0, α, β) and (γ, γ, δ) are collinear, then

    | λ  0  μ |
    | 0  α  β |  =  -γ(αμ + βλ) + δαλ  =  0.
    | γ  γ  δ |

Thus if α ≠ 0 and λ ≠ 0, δ/γ = β/α + μ/λ.
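
The collinearity computation above can be checked with exact rational arithmetic. The sketch below is an illustration only: it works over the rationals rather than the paper's field, and the helper names `det3` and `sum_point` are hypothetical. Given (λ, 0, μ) and (0, α, β), it produces the point (γ, γ, δ) on x = y collinear with them and confirms that δ/γ = β/α + μ/λ.

```python
from fractions import Fraction

def det3(r1, r2, r3):
    # determinant of the 3x3 matrix with rows r1, r2, r3
    (a, b, c), (d, e, f), (g, h, i) = r1, r2, r3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def sum_point(lam, mu, alpha, beta):
    """Point (gamma, gamma, delta) on x = y collinear with (lam, 0, mu) and (0, alpha, beta).

    Assumes alpha != 0 and lam != 0, as in the text.
    """
    gamma = alpha * lam
    delta = alpha * mu + beta * lam
    return (gamma, gamma, delta)

lam, mu, alpha, beta = 2, 3, 5, 7
point = sum_point(lam, mu, alpha, beta)
print(det3((lam, 0, mu), (0, alpha, beta), point))          # collinearity determinant
print(Fraction(point[2], point[0]), Fraction(beta, alpha) + Fraction(mu, lam))
```

Here the ratio of the last to the first coordinate plays the role of the value of the function at P, which is why collinearity of the three points translates into addition of the two values.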




By Theorem 5.2, given f_ABC and f_DEF, we may always assume that A
and C are relatively prime integral divisors; and we may also make this
assumption in regard to D and F. Thus f_ABC(P) = ∞ if and only if P | C;
similarly f_DEF(P) = ∞ if and only if P | F. From Theorem 3.2 the multiples
of any prime divisor P in the family [CF, CD, AF] are at least a two dimensional
family. If the multiples of P are all of the plane, then P | CF, P | CD
and P | AF. Hence by the relative primeness of A and C, and D and F,
P | C and P | F. Hence if f_ABC(P) ≠ ∞ and f_DEF(P) ≠ ∞ the multiples of
P in the plane form a line. Thus F_{R,CF}(P), F_{AF,CF}(P) and F_{CD,CF}(P) are
collinear. Hence if their coordinates are (γ, γ, δ), (0, α, β) and (λ, 0, μ),
respectively, then α ≠ 0 and λ ≠ 0 and δ/γ = β/α + μ/λ. σ_1, σ_2, σ_3 are the
multiprojectivities used for defining f_ABC, f_DEF and f_{R,S,CF} relative to Φ_{CD,R,AF}
and f_ABC(P) = β/α, f_DEF(P) = μ/λ and f_{R,S,CF}(P) = δ/γ. Hence the
theorem.


Definition. If λ ∈ Φ, λ ≠ 0, then λf_ABC = f_{AB'C} where λ^{-1} = (X, Y, Z;
π_ABC(B')).

THEOREM 5.4. For every prime divisor P, (λf_ABC)(P) = λ(f_ABC(P)).


Proof. Let R = π_ABC(B'). Since B' ≠ A, π_ABC(B') = R ≠ X. We
define σ as follows:

(X, Y, Z; σ(T)) = (X, Y, Z; T)(X, Y, Z; R)^{-1}. Thus σ is a projectivity,
and σπ_ABC is a multiprojectivity. Since (X, Y, Z; X)(X, Y, Z; R)^{-1}
= (X, Y, Z; X), σπ_ABC(A) = σ(X) = X. Similarly σπ_ABC(C) = Z. Since
(X, Y, Z; R)(X, Y, Z; R)^{-1} = (X, Y, Z; Y), σ(R) = Y. Thus σπ_ABC(B')
= Y, from the definition of R. But π_{AB'C} is also a multiprojectivity taking
A into X, B' into Y and C into Z. Thus from Lemma 2, σπ_ABC = π_{AB'C}.

Thus we have,

(λf_ABC)(P) = f_{AB'C}(P) = (X, Y, Z; π_{AB'C}F_AC(P)) = (X, Y, Z; σπ_ABC F_AC(P))
= (X, Y, Z; π_ABC F_AC(P)) · (X, Y, Z; R)^{-1} from the definition of σ
= f_ABC(P)λ since (X, Y, Z; R)^{-1} = λ.

Hence we have the theorem.


Definition. If λ ∈ Φ then f_ABC + λ = f_{A'B'C} where -λ = (X, Y, Z;
π_ABC(A')), and -λ + 1 = (X, Y, Z; π_ABC(B')).

Using the exact type of argument as in the proof of Theorem 5.4, with
the obvious modifications we obtain

THEOREM 5.5. For every prime divisor P,

(f_ABC + λ)(P) = f_ABC(P) + λ.

The following theorems, essentially the converses of Theorems 5.4 and
5.5, can also be proved along these lines.

THEOREM 5.6. For every prime divisor P, f_ABC(P) = λf_{AB'C}(P) where
λ = (X, Y, Z; π_ABC(B')).


THEOREM 5.7. If A, A', and C are dependent divisors, then f_ABC
= λf_{A'B'C} + μ where λ, μ ∈ Φ.


We now return to the case of the sum of f_ABC and f_DEF where CF, AF
and CD are dependent. Since we may assume that A and C, and D and F
are relatively prime integral divisors (Theorem 5.2), from A and C prime to
each other it follows by Theorem 3.1 that C | F. Similarly we can obtain
that F | C. Hence F = C. Thus A, D, C are dependent, and so from Theorem
5.7 we have that f_ABC = λf_DEF + μ where λ, μ ∈ Φ. If λ ≠ -1 we define
f_ABC + f_DEF = f_RSC where f_RSC = (λ + 1)f_DEF + μ. Applying Theorems 5.5
and 5.6 we have in this case also that for those prime divisors P for
which neither function is infinite, (f_ABC + f_DEF)(P) = f_ABC(P) + f_DEF(P).
If λ = -1, we define f_ABC + f_DEF = μ and again for those prime divisors P
for which neither function is infinite, (f_ABC + f_DEF)(P) = f_ABC(P) + f_DEF(P).

Now that addition has been defined and shown to have some desirable
properties, we turn to multiplication. We proceed as follows.


Definition. If f_ABC and f_DEF are such that AD, CD, and CF are independent
divisors, then f_ABC f_DEF = f_{AD,M,CF} where M is determined as in
Diagram 2.

Using analytic projective geometry, and proceeding as in the proof of
Theorem 5.3 we obtain

THEOREM 5.8. If f_ABC and f_DEF satisfy the conditions of the preceding
definition, then for all P such that f_ABC(P) ≠ ∞ and f_DEF(P) ≠ ∞,

(f_ABC f_DEF)(P) = f_ABC(P)f_DEF(P).


We now return to the definition of multiplication. Suppose that f_ABC
and f_DEF are two functions. Without loss of generality we may assume that
A and C, and also D and F are relatively prime integral divisors. If AD,
CD, and CF are independent, then by definition f_ABC f_DEF = f_{AD,M,CF}; and if
AD, AF and CF are independent, f_DEF f_ABC = f_{AD,N,CF} (see Diagram 3).

Pick a P so that P ∤ CF and P ∤ AD. Then there is a D_P ∈ [AD, CF] such
that P | D_P. Since f_ABC(P) ≠ ∞ and f_DEF(P) ≠ ∞, by Theorem 5.8

(f_ABC f_DEF)(P) = f_ABC(P)f_DEF(P) = f_DEF(P)f_ABC(P) = (f_DEF f_ABC)(P).

Thus π_{AD,M,CF}D_P = π_{AD,N,CF}D_P. Since the two multiprojectivities coincide
on the distinct points AD, CF and D_P, the multiprojectivities are equal, and
so M = N, and the multiplication is commutative, in this case.

If AD, CD, and CF are dependent, then C | D and D | C, thus C = D.
If in addition AF, AD, and CF are also dependent, then A = F. Hence in
this case f_DEF = f_CEA. Combining the following theorem and Theorem
5.6, then f_DEF = λ(f_ABC)^{-1} and for all P for which f_DEF is not infinite,
f_ABC(P)f_DEF(P) = λ. We define f_ABC f_DEF = λ.


[DIAGRAM 3. Labeled points: CD, BD, CE, AD, CF, M, N, BF, AE, AF.]


If AF, AD, and CF are independent then f_DEF f_ABC can be constructed.
In this case we define f_ABC f_DEF = f_DEF f_ABC. Thus for all P such that
f_ABC(P) ≠ ∞ and f_DEF(P) ≠ ∞ we have

(f_ABC f_DEF)(P) = (f_DEF f_ABC)(P) = f_DEF(P)f_ABC(P) = f_ABC(P)f_DEF(P).

Hence in all cases (f_ABC f_DEF)(P) = f_ABC(P)f_DEF(P); moreover the
multiplication is commutative.


THEOREM 5.9. For all P such that f_ABC(P) ≠ ∞, f_ABC(P) = (f_CBA(P))^{-1}.

Proof. We define σ as follows:

(1) σ(X) = Z;

(2) σ(Z) = X;

(3) for all R ≠ X and R ≠ Z, (X, Y, Z; σ(R)) = (X, Y, Z; R)^{-1}.

Thus σ(Y) = Y. Since σ is a projectivity, σπ_ABC is a multiprojectivity;
proceeding as in the proof of Theorem 5.4 we obtain this theorem.


By G<R, S, T> we mean the marked line <RG, SG, TG>.

THEOREM 5.10. If f_ABC = f_RST then for some divisor G, <A, B, C>
= G<R, S, T>.


Proof. If CT, AT and CR are independent, then f_ABC f_TSR is not a
constant. But since f_ABC f_TSR = 1 from Theorem 5.9, CT, AT, and CR are
dependent. Without loss of generality we can assume that A and C and also
T and R are relatively prime integral divisors. Then C = T since C | T and
T | C. Thus C, A and R are dependent. Since A and C are relatively prime
and since R ∈ [A, C], R ≠ A would imply that there exists a P with P | A
and P ∤ R. Thus f_ABC(P) = (X, Y, Z; X) and f_RST(P) ≠ (X, Y, Z; X),
contradicting the hypothesis. Thus A = R. Similarly B = S. If A and C are
not relatively prime, then writing A = A'U and C = C'U, where A' and C'
are relatively prime and integral, the result comes immediately. Combining
Theorems 5.2 and 5.10 we have

COROLLARY. f_ABC = f_RST if and only if <A, B, C> = G<R, S, T> for
some G.

LEMMA 3. If f_ABC(P) = f_DEF(P) for all but a finite number of P's,
then f_ABC = f_DEF.

Proof. Without loss of generality, A and C, and also D and F, are
relatively prime integral divisors. Since, for all but a finite number of P,
(f_ABC f_FED)(P) = f_ABC(P)f_FED(P) = f_ABC(P)(f_DEF(P))^{-1} = 1 (hypothesis
and Theorem 5.9), f_ABC f_FED takes on only a finite number of values. But
this can only happen when AF = CD. From the relative primeness of the
factors involved, A = D and F = C. From the uniqueness of a multiprojectivity
taking three collinear points into three others, it follows that B = E.
Thus the lemma.
Using the results of Theorems 5.3-5.10 and the above lemma, we easily
obtain

THEOREM 5.11. For all functions f_ABC, f_DEF, f_GHK, we have

1. (f_ABC + f_DEF) + f_GHK = f_ABC + (f_DEF + f_GHK);

2. (f_ABC f_DEF)f_GHK = f_ABC(f_DEF f_GHK);

3. f_ABC(f_DEF + f_GHK) = f_ABC f_DEF + f_ABC f_GHK.


Let Kg be the set of all functions f_ABC (that is, where <A, B, C> is any
marked line) with the field Φ adjoined. Thus we can introduce in Kg two
operations which are commutative, associative, distributive. In addition (by
Theorem 5.6) we have a multiplicative unit, and (by Theorem 5.5) an
additive unit element. Theorem 5.9 gives us the multiplicative inverses, and
if we define -f = (-1)f we obtain


THEOREM 5.12. Kg is a field under the compositions defined. 


6. Linear dependence in Kg. In this section we will study the relation
between the dependence of a finite sequence of principal divisors (that is,
divisors which are in the principal subgroup) A_1/C_1, ..., A_n/C_n and the
linear dependence of the functions f_{A_1B_1C_1}, ..., f_{A_nB_nC_n} over the field Φ, for
all relevant choices of the B_i.

THEOREM 6.1. If the principal divisors A_1/C_1, ..., A_n/C_n are dependent,
then the functions f_{A_1B_1C_1}, ..., f_{A_nB_nC_n}, for all relevant choices of the B_i,
are linearly dependent over Φ.


Proof. By induction over n. 


1) If n = 2, then since A_1/C_1 and A_2/C_2 are dependent, they are equal.
Hence A_1C_2 = A_2C_1. From Theorem 5.2, combined with Theorem 5.6,
f_{A_1B_1C_1} = λf_{A_2B_2C_2} where λ ∈ Φ.
Thus the theorem is true for n = 2.

2) n = 3.

Since A_1/C_1, A_2/C_2, A_3/C_3 are dependent, A_1C_2C_3, A_2C_1C_3, A_3C_1C_2 are
dependent. If any two of them are equal, the result will follow from the case
n = 2. Suppose the three are distinct. Without loss of generality each pair
A_i, C_i are relatively prime integral divisors. Thus if A_1C_2C_3, A_2C_1C_3 and
C_1C_2C_3 are dependent, then C_1 = C_2; and then A_1, A_2 and C_1 are dependent.
Thus by Theorem 5.7, f_{A_1B_1C_1} = λf_{A_2B_2C_2} + μ where λ, μ ∈ Φ. Their dependence
would also force the dependence of A_2C_1C_3, A_3C_1C_2 and C_1C_2C_3. Thus
C_2 = C_3 and A_2, A_3, and C_2 are dependent. Thus f_{A_3B_3C_3} = αf_{A_2B_2C_2} + β where
α, β ∈ Φ. Combining the two expressions obtained, the linear dependence of
the functions over Φ follows.

So let us assume that A_1C_2C_3, A_2C_1C_3, and C_1C_2C_3 are independent. Thus
the points indicated in Diagram 4 all lie in the plane determined by these three
independent points. Using B'_1C_2C_3 and B'_2C_1C_3 as obtained in the diagram,
we have, using Theorem 5.6, that f_{A_1B'_1C_1} = λf_{A_1B_1C_1} and f_{A_2B'_2C_2} = μf_{A_2B_2C_2}
where λ, μ ∈ Φ. Also from the definition of the sum of two functions,
f_{A_3B_3C_3} = f_{A_1B'_1C_1} + f_{A_2B'_2C_2} = λf_{A_1B_1C_1} + μf_{A_2B_2C_2} and hence the theorem for
n = 3.
3) Suppose the theorem is true for all n<r (r>3). 


[DIAGRAM 4.]


Consider A_1/C_1, ..., A_r/C_r, a sequence of dependent principal divisors.
If any r - 1 of them or less are dependent, then we are done. Hence we
suppose any r - 1 of them or less are independent. Since r > 3, A_1/C_1,
A_2/C_2 are independent and A_3/C_3, ..., A_r/C_r are independent. But since
the A_i/C_i, i = 1, ..., r are dependent, by axiom 3c there is a divisor R/T
which depends on both A_1/C_1, A_2/C_2 and on A_3/C_3, ..., A_r/C_r. Hence by
the induction hypothesis f_RST is linearly dependent (over Φ) on f_{A_1B_1C_1}, f_{A_2B_2C_2}
and on f_{A_3B_3C_3}, ..., f_{A_rB_rC_r}. Thus the f_{A_iB_iC_i}, i = 1, ..., r, are linearly
dependent over Φ; completing the induction.

THEOREM 6.2. If f_{A_1B_1C_1}, ..., f_{A_nB_nC_n} are linearly dependent over Φ
then A_1/C_1, ..., A_n/C_n are dependent divisors.

Proof. By induction over n:

1) If n = 2, f_{A_1B_1C_1} = λf_{A_2B_2C_2} = f_{A_2B'_2C_2} by Theorem 5.6, and so by
Theorem 5.10, A_1/C_1 = A_2/C_2.




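2) If n = 3 then by hypothesis λ_1f_{A_1B_1C_1} + λ_2f_{A_2B_2C_2} + λ_3f_{A_3B_3C_3} = 0.
If any λ_i = 0, then the result follows from the case n = 2. Let us assume
that all the λ_i ≠ 0. Then by Theorem 5.6 λ_1f_{A_1B_1C_1} = f_{A_1B'_1C_1},
-λ_2f_{A_2B_2C_2} = f_{A_2B'_2C_2} and -λ_3f_{A_3B_3C_3} = f_{A_3B'_3C_3}, so that
f_{A_1B'_1C_1} = f_{A_2B'_2C_2} + f_{A_3B'_3C_3}. Constructing the sum f_{A_2B'_2C_2} + f_{A_3B'_3C_3}
we obtain a function f_{R,S,C_2C_3}
(not a constant since it must be equal to f_{A_1B'_1C_1} which is not a constant)
where, in all cases of addition, R/C_2C_3, A_2/C_2, A_3/C_3 are dependent. Since
f_{A_1B'_1C_1} = f_{R,S,C_2C_3}, by Theorem 5.10 A_1/C_1 = R/C_2C_3. Hence the three
divisors A_1/C_1, A_2/C_2 and A_3/C_3 are dependent.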


3) Suppose the theorem is true for n = r - 1 ≥ 3.

Suppose that Σ_{i=1}^{r} λ_if_{A_iB_iC_i} = 0, and not all the λ_i = 0. If any λ_i = 0
then the result follows from the induction hypothesis. So we may assume that
no λ_i = 0. Let f_RST = Σ_{i=1}^{r-2} λ_if_{A_iB_iC_i}. If f_RST is not a constant, then by the
induction hypothesis, and since no λ_i = 0, R/T depends on A_1/C_1, ...,
A_{r-2}/C_{r-2}. Since f_RST + λ_{r-1}f_{A_{r-1}B_{r-1}C_{r-1}} + λ_rf_{A_rB_rC_r} = 0, from the case n = 3,
R/T depends on A_{r-1}/C_{r-1} and A_r/C_r. Thus A_r/C_r depends on A_{r-1}/C_{r-1} and
R/T and so on A_{r-1}/C_{r-1}, A_1/C_1, ..., A_{r-2}/C_{r-2}. Thus in this case the
theorem would be true. So suppose that f_RST is a constant. Thus, from
Theorems 5.5 and 5.7 we may assume that C_{r-1} = C_r and A_{r-1} ∈ [A_r, C_r]
since f_{A_{r-1}B_{r-1}C_{r-1}} = αf_{A_rB_rC_r} + β where α, β ∈ Φ. Now if ... = μ
and if f_{R'S'T'} is not a constant the result will follow as above. So suppose
that λ_1f_{A_1B_1C_1} + λ_rf_{A_rB_rC_r} = α in Φ. Hence again we may assume that C_1 = C_r
and A_1 ∈ [A_r, C_r]. Thus A_1, A_{r-1}, A_r are dependent, and since the three C's
are equal, A_1/C_1, A_{r-1}/C_{r-1}, A_r/C_r are dependent. Thus we have proved the
theorem.

Combining Theorems 6.1 and 6.2 we have

THEOREM 6.3. The functions f_{A_1B_1C_1}, ..., f_{A_nB_nC_n} are linearly dependent
over Φ if and only if the divisors A_1/C_1, ..., A_n/C_n are dependent.


7. Algebraic properties of Kg. We will now consider some of the
algebraic properties of Kg; in fact we will show that Kg is an algebraic
function field in one variable over the algebraically closed field Φ. Moreover
we show that if K is an algebraic function field in one variable over an
algebraically closed constant field k, and Kg is constructed from the divisor
algebra of K, then K is isomorphic to Kg.




We begin with the following lemma:

LEMMA 4. If A is an integral divisor then -1 ≤ n(A) - dim {A}.

Proof. By Theorem 3.4, if A | B then n(A) - dim {A} ≤ n(B)
- dim {B}. If A is integral then I | A and so the result follows immediately.

THEOREM 7.1. If f_ABC ∈ Kg then f_ABC is transcendental over Φ.


Proof. Since A ≠ C, by the corollary to Theorem 3.6, for all n > 0,
A^n, A^{n-1}C, ..., AC^{n-1}, C^n are independent divisors. Hence by Theorem 6.3 the
functions f_{AB_1C}, ..., f_{A^nB_nC^n} are linearly independent over Φ, for all n. From
the definition of multiplication f_{A^iB_iC^i} = λ_i(f_ABC)^i. Thus we have shown that
the positive integral powers of f_ABC are linearly independent over Φ.


THEOREM 7.2. If A and B, and also C and D are independent integral
divisors, then there exist integers m and n such that the divisors A^iC^j/B^iD^j,
i = 0, 1, ..., m, j = 0, 1, ..., n, are dependent. In fact n = n(A) and
m = (n(C) - 1)n(A) + 1 will do.


Proof. We multiply the sequence {A^i/B^i · C^j/D^j} by B^mD^n. We then
obtain a sequence {A^iB^{m-i}C^jD^{n-j}} of integral divisors. If for some i_0 ≠ i_1 or
j_0 ≠ j_1, A^{i_0}B^{m-i_0}C^{j_0}D^{n-j_0} = A^{i_1}B^{m-i_1}C^{j_1}D^{n-j_1}, then
(A/B)^{i_0-i_1} = (C/D)^{j_1-j_0}. Thus
they would be dependent and the theorem would be true. If they are all
distinct then we would have mn + m + n + 1 distinct integral divisors. If
we could force dim {A^mC^n} < mn + m + n + 1 for some m and n, then the
divisors A^iB^{m-i}C^jD^{n-j} would be dependent. By Lemma 4 dim {A^mC^n}
≤ n(A^mC^n) + 1. Thus if for some m and n, n(A^mC^n) < mn + m + n, we
are finished. Hence we try to satisfy m·n(A) + n·n(C) < mn + m + n.
We see that n = n(A) and m = (n(C) - 1)(n(A)) + 1 will do.
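
The closing claim of the proof is a small piece of arithmetic that can be checked directly. The sketch below assumes the inequality to be satisfied is the strict one, m·n(A) + n·n(C) < mn + m + n (the relation sign is illegible in the source and is inferred from the preceding sentence), and that n(A) and n(C) are positive integers. The function names are hypothetical.

```python
def witness_exponents(n_A, n_C):
    """Exponents (m, n) proposed at the end of the proof of Theorem 7.2."""
    n = n_A
    m = (n_C - 1) * n_A + 1
    return m, n

def order_bound_holds(n_A, n_C, m, n):
    # n(A^m C^n) = m*n(A) + n*n(C) must be strictly below mn + m + n
    return m * n_A + n * n_C < m * n + m + n

checked = all(
    order_bound_holds(a, c, *witness_exponents(a, c))
    for a in range(1, 30)
    for c in range(1, 30)
)
print(checked)
```

With n = n(A) the terms m·n(A) and mn cancel, so the condition reduces to n(A)·n(C) < m + n(A), and the proposed m makes the right-hand side exactly n(A)·n(C) + 1.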


THEOREM 7.3. If X, Y ∈ Kg then there exist a finite number of λ_ij ∈ Φ,
not all 0, such that Σ λ_ij X^iY^j = 0.

Proof. If either X or Y is in Φ the theorem is trivial. Thus we may
assume that X = f_ABC and Y = f_DEF where A, B, ..., F are all integral
divisors. From Theorem 7.2 there exist integers m and n such that the
divisors (A^i/C^i)(D^j/F^j) with i = 0, 1, ..., m and j = 0, 1, ..., n are
dependent. Thus the functions f_{A^iD^j, B_ij, C^iF^j} are linearly dependent over Φ.
Thus from the definition of the product of two functions, the functions
(f_ABC)^i(f_DEF)^j are linearly dependent over Φ.

COROLLARY 1. Kg is of degree of transcendence one over Φ.




Definition. An extension field K of a field k will be said to be algebraic
and of bounded degree over k if there exists an integer N > 0 such that for
every x ∈ K there exists a polynomial p_x of degree at most N, with coefficients
in k and such that p_x(x) = 0.

COROLLARY 2. Kg is algebraic and of bounded degree over Φ(f_ABC).

Proof. From Theorem 7.3 Kg is algebraic over Φ(f_ABC). If A and C
are relatively prime integral divisors, as we may assume, then by Theorem 7.2,
N = n(A) will satisfy the requirements of the above definition.


THEOREM 7.4. Φ is algebraically closed.

Proof. To show the algebraic closure it is sufficient to show the reducibility
of all non-linear elements in Φ[x] for x any element transcendental
over Φ. Pick x = f_ABC, and let Σ_{i=0}^{n} α_ix^i be any polynomial in Φ[x]. Construct
the sum Σ_{i=0}^{n} α_i(f_ABC)^i = f_{R,S,C^n}; by Theorem 7.1 R ≠ C^n. Without loss of
generality R and C^n are relatively prime, for otherwise we could take out the
common factor. Thus we can find a P such that P | R, P ∤ C^n. Thus
f_{R,S,C^n}(P) = 0, and so f_{R,S,C^n} has as a linear factor f_ABC - f_ABC(P). Hence
the theorem.


THEOREM 7.5.¹ If K is transcendental over the perfect field k and K is
algebraic and of bounded degree over k(x), then K is a finite extension of k(x).

Proof. If k is of characteristic zero then K is separable over k(x) and
the result is immediate.

Suppose k is of characteristic p ≠ 0. Then since k is perfect it contains
the p-th roots of all its elements. Let S be the set of all f in K which are
separable over k(x). Then, by Van der Waerden [3, p. 129] S is a field.
Since S is separable, algebraic and of bounded degree over k(x), it is a finite,
and hence simple, extension of k(x). Let S = k(x, y). Since K is algebraic
and of bounded degree over k(x) and k is perfect, for sufficiently large n,
S(x^t, y^t) ⊃ K where t = 1/p^n. But the degree of S(x^t, y^t) over S is at
most t^{-2} and hence, because S is a finite extension of k(x), S(x^t, y^t) is a finite
extension of k(x). Since K ⊂ S(x^t, y^t), K is also finite over k(x).

Since in our case Kg is algebraic and of bounded degree over Φ(f_ABC)
and Φ is algebraically closed, and hence perfect, we immediately have

¹ The author is indebted to Professor George Whaples for this simple proof of the
theorem.




THEOREM 7.6. If f_ABC ∈ Kg then Kg is a finite extension of Φ(f_ABC).

Or using MacLane's definition [1, chapter 1] Kg is an algebraic function
field in one variable over Φ.

Let us now return to the concrete case of an algebraic function field K
(in one variable) over the algebraically closed field k. Suppose that the
divisor group of K is geometrized as in Section 1. Thus we know that k
is isomorphic to the fields Φ_XYZ (Theorem 1.1) and we pick a definite one Φ.
Then if f ↔ <A, B, C> (see Section 1) we define σ*(f) = f_ABC. If σ is the
isomorphism of k onto Φ, we define σ*(a) = σ(a) for a in k. Thus from
Theorems 1.2 and 1.3 we have

THEOREM 7.7. K is isomorphic to Kg under the isomorphism σ*.


Before closing we would like to point out that in the definitions of sums 
and products of our functions the concept of prime divisor was not used. 
Thus if we consider a projective space whose elements form a group and 
where group multiplication preserves collinearity, such sums and products 
(of marked lines) could be defined. The prime divisors were used to justify 
that these definitions of sum and product lead to a field. 


INDIANA UNIVERSITY. 


BIBLIOGRAPHY. 


[1] S. MacLane, Lecture notes on "Algebraic Functions," Cambridge 1947.
[2] Hodge and Pedoe, "Methods of Algebraic Geometry," vol. 1, Cambridge 1947.
[3] B. L. Van der Waerden, "Moderne Algebra," vol. 1, Second Edition, Berlin 1940.



PRIME IDEALS IN GENERAL RINGS.*


By NEAL H. McCOY.


1. Introduction. The concept of prime ideal has played an important
role in the theory of commutative rings, but has not been used so extensively
in the study of noncommutative rings. Some properties of prime ideals in
general rings have been discussed by Krull [9] and by Fitting [3]. However,
except in these papers, prime ideals seem to have been used only incidentally
and not made the subject of special study. It is the purpose of the present
paper to extend to general, that is, not necessarily commutative, rings several
results which are well known in the commutative case.

Unless otherwise stated, the word ideal shall mean two-sided ideal. In
a commutative ring R an ideal p is a prime ideal if and only if ab ≡ 0 (p)
implies that a ≡ 0 (p) or b ≡ 0 (p). Naturally, this definition could also be
used in noncommutative rings, as has been pointed out by Fitting [3], who
says that a prime ideal according to this definition is completely prime.
However, it turns out that this concept is not particularly useful, since a
noncommutative ring seldom contains very many completely prime ideals.
In other words, the defining condition is too strong to be of much interest.

In an arbitrary ring R, it is customary to call an ideal p a prime ideal
if and only if ab ≡ 0 (p) implies that a ≡ 0 (p) or b ≡ 0 (p), it being understood
that a and b are ideals in R. An ideal which is completely prime is
prime, but the converse is not generally true. However, these concepts
coincide in the case of commutative rings.

Our first theorem gives a number of properties of an ideal, each of which
is equivalent to that just used to define a prime ideal. In particular, it
follows that an ideal p in the arbitrary ring R is a prime ideal in R if and
only if aRb ≡ 0 (p) implies that a ≡ 0 (p) or b ≡ 0 (p), where a and b are
now elements of R. This suggests the
desirability of defining an m-system (generalizing the familiar concept of
multiplicative system) M of elements of R as a system with the property
that c ∈ M, d ∈ M imply the existence of an element x of R such that cxd ∈ M.
Thus an ideal p in R is a prime ideal if and only if the complement of p
in R is an m-system. This characterization of the prime ideals plays an
important role in the sequel.
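
The difference between prime and completely prime ideals can be seen in a small finite ring. The sketch below uses the ring of 2 by 2 matrices over the field with two elements, a choice made here purely for illustration and not taken from the paper; the function names are hypothetical. In that ring the nonzero elements form an m-system, so the zero ideal is prime, yet they are not closed under multiplication, so the zero ideal is not completely prime.

```python
from itertools import product

def mat_mul(a, b):
    # product of 2x2 matrices over F_2, each stored as (a11, a12, a21, a22)
    return (
        (a[0] * b[0] + a[1] * b[2]) % 2,
        (a[0] * b[1] + a[1] * b[3]) % 2,
        (a[2] * b[0] + a[3] * b[2]) % 2,
        (a[2] * b[1] + a[3] * b[3]) % 2,
    )

RING = list(product((0, 1), repeat=4))      # all 16 matrices
ZERO = (0, 0, 0, 0)
NONZERO = [m for m in RING if m != ZERO]    # complement of the zero ideal

def is_m_system(elements):
    # for all c, d in the system some x in the ring gives c*x*d in the system
    s = set(elements)
    return all(
        any(mat_mul(mat_mul(c, x), d) in s for x in RING)
        for c in s
        for d in s
    )

def is_multiplicative_system(elements):
    # the stronger, commutative-style condition: c*d itself stays in the system
    s = set(elements)
    return all(mat_mul(c, d) in s for c in s for d in s)

print(is_m_system(NONZERO), is_multiplicative_system(NONZERO))
```

The second check fails because, for instance, the two diagonal matrix units multiply to zero, while the first succeeds because a suitable matrix unit x can always be inserted between two nonzero matrices.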


* Received July 20, 1948; presented to the American Mathematical Society, April 17, 1948.


824 NEAL H. MCCOY. 


If a is an ideal in R, the radical of the ideal a is defined to be the set
of all elements r of R with the property that every m-system which contains r
contains an element of a.¹ It is shown in §3 that the radical of a is the
intersection of all the prime ideals which contain a. The methods are based on
those of Krull [10], and the results reduce to those of Krull if R happens to be
a commutative ring. The material of §3 is a simple adaptation of the exposition
of Krull’s results to be found in Chapter V of [14].

A considerable number of different definitions of the radical of a general 
ring have been proposed. We shall add to this list by giving still another 
definition as follows. The radical N of the ring R is the radical of the zero
ideal in R. We shall show that N is a nil ideal which contains every nilpotent
ideal of R, and that N is a radical ideal in the sense of Baer [1]. The
relation of N to the radicals of Köthe [8] and Levitzki [11] and [12] is still
an unsolved problem. In common with all the other definitions of the
radical of a general ring, N becomes the classical radical in the presence of
the descending chain condition for right ideals. Furthermore, N has all
the usual properties expected of a radical.

A primitive ideal as defined by Jacobson [7] is a prime ideal, and hence 
N is contained in the Jacobson radical of R. We may also point out that
the method used by Jacobson [7] to introduce a topology in the set of 
primitive ideals in a ring can be used without modification to introduce a 
topology in the set of prime ideals in a ring. In fact, several of the results 
of [7] can be easily carried over to results about the space of prime ideals 
in a ring. 

The radical recently defined by Brown and McCoy [2] is also the 
intersection of a certain class of prime ideals, namely, those maximal ideals 
m such that R/m has a unit element. 

A ring in which (0) is a prime ideal may be called a prime ring. Thus 
the primitive rings of Jacobson [6] are prime rings. In §5 we shall prove
that a prime ring which contains minimal right ideals is a primitive ring. 
However, these concepts do not coincide in general, for any integral domain 
is a prime ring and an integral domain is primitive if and only if it is a 
field. 

We shall point out in Theorem 6 that a ring is isomorphic to a sub- 
direct sum of prime rings if and only if it has zero radical. This is an 


1 Fitting [3] defined the radical of a to be the set of elements which generate nil 
ideals modulo a. The radical of a as defined above is contained in Fitting’s radical, 
but the exact relation between these concepts is an unsolved problem. 




PRIME IDEALS IN GENERAL RINGS. 825 


analogue of one of the Wedderburn-Artin structure theorems. In view of 
this result it would seem desirable to make a further study of the prime rings. 


2. Definition and fundamental properties. We begin by proving the 
following result: 


THEOREM 1. If p is an ideal in the arbitrary ring R, the following
conditions are equivalent:

(i) If a, b are ideals in R such that ab ≡ 0(p), then a ≡ 0(p) or
b ≡ 0(p).

(ii) If (a), (b) are principal ideals in R such that (a)(b) ≡ 0(p),
then a ≡ 0(p) or b ≡ 0(p).

(iii) If aRb ≡ 0(p), then a ≡ 0(p) or b ≡ 0(p).

(iv) If I_1, I_2 are right ideals in R such that I_1I_2 ≡ 0(p), then
I_1 ≡ 0(p) or I_2 ≡ 0(p).

(v) If J_1, J_2 are left ideals in R such that J_1J_2 ≡ 0(p), then J_1 ≡ 0(p)
or J_2 ≡ 0(p).


Before giving the proof we make one observation which will be useful.
Clearly (i), although stated for the product of two ideals, implies that if a
product of any finite number of ideals is in p, at least one of the ideals is
in p. A similar result holds for (ii), but is not quite so obvious. However,
suppose that (ii) holds and that (a)(b)(c) ≡ 0(p) with a ≢ 0(p). Then
for every b_1 in (b), c_1 in (c), we have (a)(b_1c_1) ≡ 0(p), which then implies
that b_1c_1 ≡ 0(p). This shows that (b)(c) ≡ 0(p), and hence b ≡ 0(p) or
c ≡ 0(p). In like manner, the result can be established for the product
of any finite number of principal ideals.

We are now ready to prove the theorem. Clearly (i) implies (ii). 
We now assume (ii) and prove (iii). Suppose that aRb ≡ 0(p), from which
it follows that RaRbR ≡ 0(p), and thus (a)^3(b)^3 ⊆ RaRbR ≡ 0(p). By
the observation made above, (ii) implies that a ≡ 0(p) or b ≡ 0(p), and
this establishes (iii). 

Now let us assume (iii) and suppose that I_1, I_2 are right ideals such
that I_1I_2 ≡ 0(p) with I_1 ≢ 0(p). Let a_1 be an element of I_1 not in p. Then
for every element a_2 of I_2 we have a_1Ra_2 ⊆ I_1I_2 ≡ 0(p). Hence, by (iii),




we have a_2 ≡ 0(p). Thus I_2 ≡ 0(p), and we have therefore shown that
(iii) implies (iv). A similar argument will show that also (iii) implies (v).

The proof is completed by observing that (i) is implied by either (iv) 
or (v). 


Definition 1. An ideal p with any one (and therefore all) of the 
properties stated in Theorem 1 is a prime ideal. 


Lemma 1. If p is a prime ideal in R, and a an element of R such that
RaR ≡ 0(p), then a ≡ 0(p).


To prove this, we observe that RaR ≡ 0(p) implies that aRaR ≡ 0(p),
and (iv) shows that aR ≡ 0(p). It then follows that aRa ≡ 0(p), and we
must have a ≡ 0(p) by (iii).


We next prove the following result: 


Lemma 2. If b is an ideal in R, and p a prime ideal in R, then b ∩ p
is a prime ideal in the ring b.


Let b_1, b_2 be elements of b such that b_1bb_2 ≡ 0(p ∩ b). Then b_1Rb_2Rb_2
⊆ b_1bb_2 ≡ 0(p), and hence b_1Rb_2Rb_2R ≡ 0(p). From this, (iv) implies that
b_1R ≡ 0(p) or b_2R ≡ 0(p). If b_1R ≡ 0(p), then b_1Rb_1 ≡ 0(p) and (iii)
implies that b_1 ≡ 0(p). Similarly, if b_2R ≡ 0(p), we have b_2 ≡ 0(p).
Thus either b_1 ≡ 0(p ∩ b) or b_2 ≡ 0(p ∩ b), and p ∩ b is a prime ideal in
the ring b by (iii).


Definition 2. A set M of elements of R is an m-system if and only if
c ∈ M, d ∈ M imply that there exists an element x of R such that cxd ∈ M. The
void set is to be considered as an m-system.


The importance of this concept lies in the fact that, by (iii), an ideal p
in R is a prime ideal if and only if its complement C(p) in R is an m-system.
The agreement to consider the void set as an m-system is to take care of the
special case in which p = R, for clearly R is a prime ideal in R.

It will be observed that the concept of an m-system is a generalization
of that of multiplicative system. For if M is a multiplicative system with
c ∈ M, d ∈ M, then there is an element x (either c or d may be used) such
that cxd ∈ M, and hence M is an m-system according to the definition given
above.
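As an illustration only, the m-system condition can be checked by brute force in a small finite ring. The sketch below assumes the commutative ring Z/12, which is not discussed in the text; the helper names `is_m_system`, `ideal`, and `complement` are hypothetical. It tests the complement of the prime ideal (3), the complement of the non-prime ideal (4), and the void set.

```python
from itertools import product

n = 12                      # assumed example ring: the integers modulo 12
R = range(n)

def mul(a, b):
    """Multiplication in Z/n."""
    return (a * b) % n

def is_m_system(M):
    """Definition 2: for all c, d in M some x in R gives c*x*d in M."""
    M = set(M)
    return all(any(mul(mul(c, x), d) in M for x in R)
               for c, d in product(M, repeat=2))

def ideal(g):
    """Principal ideal generated by g in Z/n."""
    return {mul(g, r) for r in R}

def complement(I):
    """Set-theoretic complement C(I) of I in R."""
    return set(R) - set(I)

prime_3 = is_m_system(complement(ideal(3)))      # (3) is a prime ideal
not_prime_4 = is_m_system(complement(ideal(4)))  # (4) is not: 2*x*2 lies in (4)
empty_ok = is_m_system(set())                    # the void set, by convention
```

Under these assumptions the first and third checks succeed and the second fails, in agreement with the criterion that an ideal is prime exactly when its complement is an m-system.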




3. The radical of an ideal. This section is based on certain material 
of Chapter V of [14] which, in turn, is largely an exposition of results due 
to Krull [10]. 


Definition 3. The radical r of an ideal a in R consists of those elements 
r of R with the property that every m-system which contains r contains an 
element of a. 


It will presently appear that r is an ideal in R. However, we first
observe that a ⊆ r. Furthermore, a and r are contained in precisely the
same prime ideals. For suppose that a ⊆ p, where p is a prime ideal, and
that r is an element of the radical. If r were not in p, that is, if r ∈ C(p),
then C(p) would have to contain an element of a since C(p) is an m-system.
But clearly C(p) contains no element of a, and therefore r is not in C(p).
Thus r ∈ p, and hence the radical is contained in p as required.


Definition 4. A prime ideal p is a minimal prime ideal belonging to
the ideal a if and only if a ⊆ p and there exists no prime ideal p’ such that
a ⊆ p’ ⊂ p.²

We are now ready to state the principal theorem of this section as 
follows: 


THEOREM 2. The radical r of an ideal a is the intersection of all the 
minimal prime ideals belonging to a. 


We shall establish several lemmas and then show how they lead to an 
immediate proof of the theorem. 

If two sets of elements of R have no elements in common, we may say 
that either of these sets does not meet the other. 


Lemma 3. Let a be an ideal in R, and M an m-system which does not 
meet a. Then M is contained in an m-system M’ which is maximal in the 
class of m-systems which do not meet a. 


This is, of course, an immediate consequence of Zorn’s Maximum
Principle and is merely stated in the form of a lemma for convenience of
reference.


Lemma 4. Let M be an m-system in R, and a an ideal which does not 


2 By p’ ⊂ p we mean that p’ is properly contained in p.




meet M. Then a is contained in an ideal p* which is maximal in the class 
of ideals which do not meet M. The ideal p* is necessarily a prime ideal. 


The existence of p* follows at once from the Maximum Principle. We
now show that p* is a prime ideal. Suppose that a ≢ 0(p*) and b ≢ 0(p*).
Then the maximal property of p* implies that (p*, a) contains an element
m_1 of M, and likewise (p*, b) contains an element m_2 of M. Thus there
exist elements a_1 of (a), b_1 of (b) such that m_1 ≡ a_1(p*), m_2 ≡ b_1(p*).
Since M is an m-system, there is an element x of R such that m_1xm_2 ∈ M, and
hence m_1xm_2 ≢ 0(p*) since p* does not meet M. But a_1xb_1 ≡ m_1xm_2(p*)
and therefore a_1xb_1 ≢ 0(p*). However, (a)(b) contains the element a_1xb_1,
and thus (a)(b) ≢ 0(p*). By property (ii) of Theorem 1, this shows that
p* is a prime ideal.


We now prove 


Lemma 5. A set p of elements of the ring R is a minimal prime ideal
belonging to a if and only if C(p) is maximal in the class of m-systems
which do not meet a.


First, let p be a set of elements of R with the property that M = C(p)
is a maximal m-system which does not meet a. If p* is the prime ideal
whose existence is asserted in Lemma 4, then C(p*) is an m-system which
contains M and does not meet a. The maximal property of M implies that
C(p*) = M = C(p), and hence p = p*. Thus p is a prime ideal containing
a. Clearly, there can exist no prime ideal p_1 such that a ⊆ p_1 ⊂ p, since
this would imply that C(p_1) is an m-system which does not meet a and
properly contains M. This is impossible because of the maximal property
of M; hence p is a minimal prime ideal belonging to a.


Conversely, if p is a minimal prime ideal belonging to a, M = C(p)
is an m-system which does not meet a, and Lemma 3 shows the existence of a
maximal m-system M’ which contains M and does not meet a. By the part
of the lemma just proved, C(M’) = p’ is a minimal prime ideal belonging
to a. Since M’ ⊇ M, it follows that p’ ⊆ p. Thus a ⊆ p’ ⊆ p, from which
it follows that p = p’, and thus M = M’. This shows that C(p) = M is a
maximal m-system which does not meet a, and completes the proof of the
lemma.


We are now ready to prove the theorem. If r is the radical of a, we 




have pointed out above that r is contained in the same prime ideals as a.
This shows that r is contained in the intersection of all the minimal prime
ideals belonging to a. Now let a be an element of R not in r. Hence, by
the definition of r, there exists an m-system M which contains the element a
but does not meet the ideal a. By Lemma 3, M is contained in a maximal
m-system M’ which does not meet the ideal a. By Lemma 5, C(M’) is a minimal
prime ideal belonging to the ideal a, and clearly C(M’) does not contain the
element a. Hence the element a cannot be in the intersection of all the
minimal prime ideals belonging to the ideal a, and the theorem is
therefore established.

The following result is an immediate consequence of the theorem just
proved:

COROLLARY. The radical of an ideal is an ideal.


If p is any prime ideal containing a, then M = C(p) is an m-system
which does not meet a. If M’ is the m-system defined in Lemma 3, Lemma 5
shows that C(M’) is a minimal prime ideal belonging to a. Since C(p) ⊆ M’,
it follows that a ⊆ C(M’) ⊆ p. This proves that any prime ideal which
contains a contains a minimal prime ideal belonging to a.


4. The radical of a ring. We now make the following definition:


Definition 5. The radical of the ring R is the radical of the zero ideal 
in R. 

We shall henceforth denote the radical of the ring R by N. It is clear
that N is a nil ideal, for if a ∈ N, the m-system {a, a^2, a^3, ···} must contain
0, and a is therefore nilpotent. Furthermore, every element b which generates
a nilpotent ideal (right, left, or two-sided) is in N. For if I is an ideal
such that I^n = 0, then I^n ≡ 0(p) for every prime ideal p in R, and this
implies that I ≡ 0(p). Hence I ≡ 0(N), since N is the intersection of all
the prime ideals in R.
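The description of N as the intersection of all the prime ideals can be tried out numerically. The following is a minimal sketch, assuming the commutative ring Z/36 as a test case (it does not appear in the text) and using the fact that every ideal of Z/n is generated by a divisor of n; the names `ideal` and `is_prime_ideal` are hypothetical helpers. Primality is tested with condition (iii) of Theorem 1, and the resulting intersection is compared with the set of nilpotent elements.

```python
n = 36                      # assumed example ring: the integers modulo 36
R = list(range(n))

def ideal(g):
    """Principal ideal generated by g in Z/n."""
    return frozenset((g * r) % n for r in R)

def is_prime_ideal(P):
    """Condition (iii) of Theorem 1: aRb inside P forces a in P or b in P."""
    return all(a in P or b in P
               for a in R for b in R
               if all((a * x * b) % n in P for x in R))

# Every ideal of Z/n is generated by a divisor of n (d = n gives the zero ideal).
ideals = {ideal(d) for d in range(1, n + 1) if n % d == 0}
primes = [P for P in ideals if is_prime_ideal(P)]

# The ring itself counts as a prime ideal here, so `primes` is never empty.
radical = frozenset(R).intersection(*primes)

# Nilpotent elements, found directly from the definition.
nilpotents = frozenset(a for a in R
                       if any(pow(a, k, n) == 0 for k in range(1, n + 1)))
```

In this commutative example the two sets coincide and equal the multiples of 6, which is only a sanity check of the statement and not a substitute for the proof.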

If a ∈ N, then clearly aR ⊆ N. Conversely, if aR ⊆ N, RaR ⊆ N, and
Lemma 1 shows that a ∈ N. We see therefore that aR ⊆ N if and only if a ∈ N.


THEOREM 3. In the presence of the descending chain condition for right 
ideals, N coincides with the classical radical of R. 


If a ∈ N, (a) is a nil ideal, and it is known that the descending chain
condition implies that (a) is then a nilpotent ideal. On the other hand,
it was pointed out above that N contains all elements which generate nilpotent
ideals. Hence N consists precisely of the elements which generate
nilpotent ideals. This, however, is one of the familiar characterizations of
the classical radical, and the proof is completed.


We shall now prove 


THEOREM 4. If b is an ideal in R, the radical of the ring b is b ∩ N.


If N’ denotes the radical of the ring b, Lemma 2 shows that N’ ⊆ b ∩ N.
Conversely, if an element b belongs to b ∩ N, then every m-system in R which
contains b contains 0. Thus, in particular, every m-system in the ring b
which contains the element b contains 0. This means that b ∈ N’, and thus
b ∩ N ⊆ N’, completing the proof.


THEOREM 5. If N is the radical of R, then R/N has zero radical.


To prove this, let ā be an element of the radical of R/N, and thus ā is
contained in all prime ideals in R/N. If ā ≠ 0, a ≢ 0(N), and hence
a is not contained in some prime ideal p in R. Since p ⊇ N, we have
R/p ≅ (R/N)/(p/N), from which it follows that p/N is a prime ideal in
R/N. Furthermore, p/N does not contain ā since a ≢ 0(p). This contradiction
shows that we must have ā = 0, which completes the proof of the
theorem.

It follows from this theorem that R/N contains no nonzero nilpotent
ideals (right, left, or two-sided), for every nilpotent ideal in R/N must be
in the radical of R/N. In particular, this shows that N is a radical ideal
in the sense of Baer [1].


5. Prime rings. We shall now make the following 


Definition 6. A ring R is a prime ring if and only if (0) is a prime 
ideal in R. 


Theorem 1 yields a number of equivalent characterizations of the prime 
rings, one of the most interesting being that a ring R is a prime ring if and 
only if aRb =0 implies that a=0 or b =0. 

It is easy to see that a commutative prime ring is just an integral domain. 
Any simple ring S (with S^2 ≠ 0) is a prime ring, and a primitive ring is
also prime, as was shown by Jacobson [6]. From Lemma 2 we also observe 
that an ideal in a prime ring is a prime ring. 

Now a prime ring has zero radical and hence in the presence of the 
descending chain condition for right ideals is isomorphic to a direct sum of 




a finite number of simple rings. However, the direct sum of two or more 
simple rings is certainly not prime, and hence if the descending chain con- 
dition holds for right ideals, the concepts of prime ring and simple ring 
(with nonzero square) coincide. 

If p is a prime ideal in the arbitrary ring R, R/p is a prime ring, and
conversely. Since N is the intersection of all the prime ideals in R, a familiar
argument³ yields the following analogue of one of the Wedderburn-Artin
theorems:


THEOREM 6. A necessary and sufficient condition that a ring be iso- 
morphic to a subdirect sum of prime rings is that it have zero radical. 


This theorem indicates the importance of prime rings in the general 
structure theory. We shall now prove a few other results about prime rings. 


THEOREM 7. A prime ring that contains minimal right ideals is a 
primitive ring. 

The following simple proof is due to Bailey Brown. If I is a minimal
right ideal of the prime ring R, then I is a simple R-module whose annihilator
I* in R is a right (in fact, two-sided) ideal such that II* = 0. Since R
is prime this implies that I* = 0 and thus I is a simple R-module with zero
annihilator, that is, R is isomorphic to an irreducible ring of endomorphisms.
This implies⁴ that R is primitive, and the proof is completed.

Now let T be a ring with unit element, and denote by T_n the ring of
all matrices of order n with elements in T. We shall prove


THEOREM 8. If T is a ring with unit element, then T_n is a prime ring
if and only if T is a prime ring.


As usual, let e_ij denote the matrix with the unit element in the i-th row
and j-th column, and zeros elsewhere. If T is not prime, then T_n is not
prime. For if T is not a prime ring, there exist nonzero elements a, b of T
such that aTb = 0. This clearly implies that (ae_11)T_n(be_11) = 0 with ae_11
and be_11 nonzero elements of T_n, and this shows that T_n is not a prime ring.

Conversely, suppose that T_n is not a prime ring, and hence that there
exist nonzero matrices (a_ij), (b_ij) in T_n such that (a_ij)T_n(b_ij) = 0. Let us
assume that a_pq ≠ 0, b_rs ≠ 0. Now, for every x in T we must have


3 See § 3 of [13] for references. 
4 Jacobson [6], p. 312.


( Σ_{i,j} a_ij e_ij ) (x e_qr) ( Σ_{k,l} b_kl e_kl ) = Σ_{i,l} a_iq x b_rl e_il = 0.


In particular, the coefficient of e_ps must be zero, that is, a_pq x b_rs = 0. Since
this is true for every x in T, this means that a_pq T b_rs = 0, and T is not a
prime ring. The proof is therefore completed.
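Both directions of this argument can be seen on small examples. The sketch below is an illustration under stated assumptions: it takes n = 2 and the coefficient rings Z/2 (a field, hence prime) and Z/4 (not prime, since 2·x·2 = 0 for every x), neither of which appears in the text; the helper names `matrices`, `matmul`, `kills`, and `is_prime_matrix_ring` are hypothetical.

```python
from itertools import product

ZERO = ((0, 0), (0, 0))

def matrices(m):
    """All 2x2 matrices with entries in Z/m."""
    return [((a, b), (c, d)) for a, b, c, d in product(range(m), repeat=4)]

def matmul(X, Y, m):
    """Product of two 2x2 matrices over Z/m."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % m
                       for j in range(2))
                 for i in range(2))

def kills(A, B, m):
    """True if A X B = 0 for every 2x2 matrix X over Z/m."""
    return all(matmul(matmul(A, X, m), B, m) == ZERO for X in matrices(m))

def is_prime_matrix_ring(m):
    """Prime-ring test: no nonzero A, B with A T_2 B = 0 (brute force)."""
    nonzero = [A for A in matrices(m) if A != ZERO]
    return not any(kills(A, B, m) for A in nonzero for B in nonzero)

# Over the field Z/2 the matrix ring passes the prime-ring test.
field_case = is_prime_matrix_ring(2)

# Over Z/4 the matrix 2*e_11 annihilates itself through the whole ring,
# mirroring the element a = b = 2 with a T b = 0 in Z/4.
E = ((2, 0), (0, 0))
witness = kills(E, E, 4)
```

The exhaustive search is only feasible for very small rings; the Z/4 case is therefore shown through a single witness pair instead of a full search.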

By use of this result we shall prove the following theorem about the
radical of a ring:


THEOREM 9. If N is the radical of the arbitrary ring R, the radical
of the complete matrix ring R_n is N_n.


We first give the proof under the assumption that R has a unit element,
and then remove this restriction. Since R is assumed to have a unit element,
there is a one-to-one correspondence M ↔ M_n between ideals in R and ideals
in R_n. Furthermore, it is easily verified that (R/M)_n ≅ R_n/M_n and thus,
by Theorem 8, M_n is a prime ideal in R_n if and only if M is a prime ideal
in R. Thus if N is the radical of R, and p_i are the prime ideals in R, we
see that

radical of R_n = ∩ (p_i)_n = (∩ p_i)_n = N_n.


If R does not have a unit element, it is well known that we can imbed
R in a ring S with unit element in such a way that R is an ideal in S. If
the radical of R is N, and the radical of S is N’, then Theorem 4 shows that
N = R ∩ N’. By the result just proved, the radical of S_n is N’_n and, since
R_n is an ideal in S_n, Theorem 4 shows that


radical of R_n = N’_n ∩ R_n = (N’ ∩ R)_n = N_n,

thus completing the proof.


SMITH COLLEGE. 


REFERENCES. 


1. R. Baer, “Radical ideals,” American Journal of Mathematics, vol. 65 (1943),
pp. 537-568.

2. B. Brown and N. H. McCoy, “Radicals and subdirect sums,” American Journal
of Mathematics, vol. 69 (1947), pp. 46-58.

3. H. Fitting, “Primärkomponentenzerlegung in nichtkommutativen Ringen,”
Mathematische Annalen, vol. 111 (1935), pp. 19-41.

4. O. Goldman, “Addition to my note on semi-simple rings,” Bulletin of the
American Mathematical Society, vol. 53 (1947), p. 956.

5. N. Jacobson, “Structure theory of simple rings without finiteness assumptions,”
Transactions of the American Mathematical Society, vol. 57 (1945), pp. 228-245.

6. N. Jacobson, “The radical and semi-simplicity for arbitrary rings,” American Journal
of Mathematics, vol. 67 (1945), pp. 300-320.

7. N. Jacobson, “A topology for the set of primitive ideals in an arbitrary ring,”
Proceedings of the National Academy of Sciences, vol. 31 (1945), pp. 333-338.

8. G. Köthe, “Die Struktur der Ringe, deren Restklassenring nach dem Radikal
vollständig reduzibel ist,” Mathematische Zeitschrift, vol. 32 (1930), pp. 161-186.

9. W. Krull, “Zur Theorie der zweiseitigen Ideale in nichtkommutativen
Bereichen,” Mathematische Zeitschrift, vol. 28 (1928), pp. 481-503.

10. W. Krull, “Idealtheorie in Ringen ohne Endlichkeitsbedingung,” Mathematische
Annalen, vol. 101 (1929), pp. 729-744.

11. J. Levitzki, “On the radical of a general ring,” Bulletin of the American
Mathematical Society, vol. 49 (1943), pp. 462-466.

12. J. Levitzki, “On three problems concerning nil-rings,” Bulletin of the American
Mathematical Society, vol. 51 (1945), pp. 913-919.

13. N. H. McCoy, “Subdirect sums of rings,” Bulletin of the American
Mathematical Society, vol. 53 (1947), pp. 856-877.

14. N. H. McCoy, Rings and Ideals, Carus Mathematical Monographs, no. 8, 1948.


SEMIGROUPS WITHOUT NILPOTENT IDEALS.* 
By A. H. CLIFFORD.


The purpose of the present paper is to extend the results of a previous
paper [1] to semigroups having a zero element. It was shown there that
if a semigroup S contains at least one left and at least one right minimal
ideal, then it contains a unique minimal two-sided ideal 𝔑 (the
“Suschkewitsch kernel,” which we formerly denoted by K), and that 𝔑 is a
completely simple semigroup without zero, in the sense of Rees [2]. If S has a
zero element 0, then 𝔑 = (0), and this result becomes trivial. We alter our
definition of minimal ideal to mean minimal but ≠ (0), and prove (Theorem
3.1 below) that if M is a minimal two-sided ideal of S containing minimal
left and right ideals of S, and if S contains no nilpotent ideal ≠ (0), then
M is a completely simple semigroup (with zero) [3].

A simple but basic lemma in ring theory is that if L is a minimal left
ideal, then either L^2 = (0) or L has an idempotent generator: L = Se (e^2 = e).
This is not so for semigroups. An example is given by Baer and Levi [4]
of a semigroup S which is left simple, hence is itself a minimal left ideal of
S, but contains no idempotent element. But it is true for semigroups S
having no nilpotent ideals ≠ (0), and in which every two-sided ideal contains
minimal left and right ideals (Theorem 4.1).

In the concluding section we discuss the connection between these results
and S. Schwarz’s theory [5] of semigroups having a kernel 𝔑 and radical ℜ.
Schwarz calls an ideal A of S “𝔑-potent” if some power A^n of A is contained
in 𝔑, and defines the radical ℜ of S to be the sum of all the 𝔑-potent two-sided
ideals of S. If ℜ is itself 𝔑-potent, then the difference semigroup S − ℜ as
defined by Rees (loc. cit., p. 389) has no non-zero nilpotent ideals, and hence
the foregoing results may be applied thereto (Theorems 5.2 and 5.3). These
are compared with similar results found by Schwarz. Finally it is remarked
that, under the assumptions made, ℜ contains all nil-ideals of S.


1. Simplicity of minimal two-sided ideals. As customary in the calculus
of complexes, the product A_1A_2 ··· A_n of a finite number of subsets A_i
of a semigroup S shall mean the set of all products a_1a_2 ··· a_n with a_i in
A_i (i = 1, 2, ···, n). In particular, A^n shall mean the set of all products
a_1a_2 ··· a_n of n elements a_i of A.


* Received October 22, 1948. 


A left [right] ideal of S is a non-vacuous subset A of S with the property
SA ⊆ A [AS ⊆ A]. A set A which is both a left and a right ideal is called
a two-sided ideal. The class sum of any number of left [right, two-sided] 
ideals is a left [right, two-sided] ideal, and the same is true of their 
intersection. 

Assume now that S contains a zero, i.e. an element 0 such that s·0
= 0·s = 0 for all s in S. Clearly S can contain only one such element.
The single-element set (0) is a two-sided ideal of S contained in every left or
right ideal of S. A left or right ideal A of S will be called nilpotent if
A^n = (0) for some positive integer n. We shall say that S is without nilpotent
ideals if S contains a zero element but no nilpotent left or right ideal
≠ (0). (From Lemma 5.2 below, this condition is satisfied if S contains a
zero element but no nilpotent two-sided ideal ≠ (0).)

A left [right, two-sided] ideal of S will be called a minimal left [right,
two-sided] ideal of S if it is ≠ (0) and contains no proper subset ≠ (0)
which is also a left [right, two-sided] ideal of S. If A and B are two distinct
minimal left [right, two-sided] ideals of S, their intersection A ∩ B = (0).
If A is a minimal two-sided ideal and B a minimal left or right ideal, either
A ∩ B = (0) or B ⊆ A.

THEOREM 1.1. Let S be a semigroup without nilpotent ideals. Then 
any minimal two-sided ideal of S is a simple semigroup. 


Proof. Let M be a minimal two-sided ideal of S, and suppose (by way
of contradiction) that B is a proper ideal ≠ (0) of M. MBM is a two-sided
ideal of S contained in B, hence properly contained in M, hence = (0) by
the minimality of M. Then (MB)^2 = MBMB = (0), so that the left ideal
MB of S is nilpotent. By hypothesis this requires MB = (0). Similarly,
(BM)^2 = (0) and hence BM = (0).

Now SBS is a two-sided ideal of S contained in M, so that either
SBS = (0) or M. In either event, since MB = (0), we conclude that (SB)^2
= SBS·B = (0). Thus the left ideal SB of S is nilpotent, and hence
SB = (0). Similarly, BS = (0). But then SB ⊆ B, BS ⊆ B, so that B is
a two-sided ideal of S, contrary to the minimality of M.


2. Two-sided ideals containing minimal left ideals. 


LEMMA 2.1. Let S be a semigroup with zero 0. If L is a minimal left
ideal of S, and c is any element of S, then either Lc is also a minimal left
ideal of S or Lc = (0).

Proof. Lc is clearly a left ideal of S. Assume Lc ≠ (0), and suppose
that B is a left ideal ≠ (0) of S contained in Lc. Let L_1 be the set of all
elements a_1 of L such that a_1c ∈ B. L_1 ≠ (0) since otherwise B = (0). For
any s in S, sa_1c ∈ B, so that sa_1 ∈ L_1. Hence L_1 is a left ideal ≠ (0) of S
contained in the minimal left ideal L. This requires L_1 = L, B ⊇ L_1c = Lc,
whence B = Lc.


THEOREM 2.1. Let S be a semigroup with zero 0. Let A be a two-sided
ideal of S containing at least one minimal left ideal of S. Then the
class sum B of all the minimal left ideals of S contained in A is a two-sided
ideal of S. In particular, if A is minimal, it is a sum of minimal left ideals
of S.


Proof. As a sum of left ideals, B is a left ideal. To show that it is
also a right ideal, we must show that if b ∈ B and c ∈ S, then bc ∈ B. By
definition of B, b must belong to some minimal left ideal L of S contained
in A. bc ∈ Lc, and, by Lemma 2.1, Lc is a minimal left ideal, evidently
contained in A, or else Lc = (0). In either event Lc ⊆ B, and hence bc ∈ B.
Clearly, if A is minimal, then A = B.


THEOREM 2.2. Let S be a semigroup without nilpotent ideals. Let M
be a minimal two-sided ideal of S containing at least one minimal left ideal
of S. Then every left ideal of M is a left ideal of S.


Proof. We show first that any minimal left ideal L of S contained in
M is also minimal regarded as a left ideal of M. Suppose that B is a left
ideal ≠ (0) of M contained in L. MB is a left ideal of S contained in
ML ⊆ L, whence MB = L or MB = (0). But MB ⊆ B, so that the first
alternative would imply L ⊆ B, B = L, as desired. We may therefore
assume MB = (0). Now SB is a left ideal of S contained in SL ⊆ L, whence
SB = L, or SB = (0). The second alternative implies that B is itself a
left ideal of S, whence B = L from the minimality of L. We may therefore
assume SB = L. Then L^2 = SBSB ⊆ SMSB ⊆ MB = (0), contrary to the
assumption that S contains no nilpotent left ideal ≠ (0).

Now let A be any left ideal of M. If A = (0), A is clearly a left ideal
of S, so we may assume A ≠ (0). By Theorem 2.1, M is the sum of all the
minimal left ideals of S contained in it. Hence each element a ≠ 0 of A
belongs to some minimal left ideal L of S contained in M. L ∩ A is a left
ideal ≠ (0) of M contained in L. Since, as we have just proved, L is also
minimal regarded as a left ideal of M, we conclude L ∩ A = L, i.e. L ⊆ A.
Hence A is the class sum of all those minimal left ideals L of S contained
in M for which L ∩ A ≠ (0). Being a sum of left ideals of S, A is itself a
left ideal of S.




3. Complete simplicity of minimal two-sided ideals containing both
left and right minimal ideals.


LEMMA 3.1. Let S be a semigroup without nilpotent ideals, and let M
be a minimal two-sided ideal of S containing at least one left and at least
one right minimal ideal of S. Then to each minimal left ideal L of S contained
in M there corresponds at least one minimal right ideal R of S
contained in M such that LR = M and RL ≠ (0).


Proof. ML is a left ideal of S contained in L; hence ML = L or
ML = (0). The second alternative would imply L^2 ⊆ ML = (0), contrary
to the assumption that S contains no nilpotent ideal ≠ (0). Hence ML = L.

LM is a two-sided ideal of S contained in M; hence LM = M or
LM = (0). The second alternative is ruled out as above, so that LM = M.

Now, by the left-right dual of Theorem 2.1, M is the class sum of all
the minimal right ideals R of S contained in M. Were LR = (0) for every
such R, we would conclude LM = (0), contrary to LM = M. Hence
LR ≠ (0) for some minimal right ideal R of S contained in M. But LR
is a two-sided ideal of S contained in M and ≠ (0), whence LR = M. Were
RL = (0) we would have L = ML = LRL = (0).

The proof of the next sequence of lemmas, culminating in Theorem 3. 1, 
is the third edition of one due to R. H. Bruck and used by the writer in two 
previous papers [6]. To avoid repetition, let us state here the hypotheses 
we assume for them all. 


(1) S is a semigroup without nilpotent ideals. 


(2) M is a minimal two-sided ideal of S containing at least one left 
and at least one right minimal ideal of 8. 


(3) L and R are minimal left and right ideals of S contained in M,
such that LR = M and RL ≠ (0).


LEMMA 3.2. If a is any element ≠ 0 of R ∩ L, and b is any element
of RL, then the equations ax = b and ya = b have solutions x, y in RL.


Proof. Ma is a left ideal of S contained in L, since a ∈ L, whence Ma = L
or Ma = (0). The latter would imply that the two-element set (0, a) would
be a nilpotent left ideal of M, hence of S by Theorem 2.2. Hence Ma = L.
aR is a right ideal of S contained in R, since a ∈ R, and hence aR = R or
aR = (0). The latter would imply, using Ma = L and (3), that
M = LR = MaR = (0). Hence aR = R. Consequently aRL = RL, so that ax = b is
solvable for x in RL.

Dually, aM is a right ideal of S contained in R. The case aM = (0)



is excluded as above, whence aM = R. La is a left ideal of S contained in L.
Were La = (0) we would have M = LR = LaM = (0). Hence La = L, RLa = RL,
and ya = b is solvable for y in RL.


LEMMA 3.3. RL is a group with zero. 


Proof. RL·RL ⊆ RL·L ⊆ RL, whence RL is closed. Since RL ⊆ R ∩ L,
we see from Lemma 3.2 that RL is a semigroup with zero in which the
equations ax = b and ya = b (a ≠ 0) are solvable for x and y. Hence, to
show that RL is a group with zero, we need only show that if a ≠ 0 and
b ≠ 0 then ab ≠ 0. By (3) there exists an element c ≠ 0 of RL. Solve
au = c for u, then bx = u for x. Then abx = au = c. Were ab = 0 we
would conclude that c = 0.


LEMMA 3.4. Let e be the identity element of the group-with-zero RL.
Then R = eS, L = Se, and R ∩ L = eSe.

Proof. Since e ∈ R, eS is a right ideal of S contained in R. eS ≠ (0)
since it contains ee = e ≠ 0. Hence eS = R. Similarly Se = L. Then
eSe = eS ∩ Se = R ∩ L.

LEMMA 3.5. R ∩ L = RL.


Proof. Clearly we need only show that R ∩ L ⊆ RL. Let a ∈ R ∩ L,
and let e be the identity element of RL. If a = 0 then a ∈ RL. If a ≠ 0
then by Lemma 3.2 we can solve ax = e for x in RL. Let x^{-1} be the inverse of
x in RL. By Lemma 3.4, a ∈ eSe. Hence a = ae = axx^{-1} = ex^{-1} ∈ RL.
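The ideals occurring in Lemmas 3.1 to 3.5 can be computed explicitly in a small case. The sketch below is an illustration only: it assumes the five-element semigroup of 2 × 2 matrix units together with zero (often called the Brandt semigroup B_2), which is not discussed in the text, and the names `mul` and `prod` are hypothetical helpers. Here M = S, and L = Se, R = eS are taken for the idempotent e = (1, 1).

```python
ZERO = 0
# Elements: the zero and the matrix units (i, j) with i, j in {1, 2}.
S = [ZERO] + [(i, j) for i in (1, 2) for j in (1, 2)]

def mul(a, b):
    """(i, j)(k, l) = (i, l) if j == k, and zero otherwise."""
    if a == ZERO or b == ZERO:
        return ZERO
    return (a[0], b[1]) if a[1] == b[0] else ZERO

def prod(A, B):
    """Product of two subsets in the calculus of complexes."""
    return {mul(a, b) for a in A for b in B}

e = (1, 1)
L = prod(S, {e})   # left ideal Se
R = prod({e}, S)   # right ideal eS
RL = prod(R, L)    # expected to be a group with zero
LR = prod(L, R)    # expected to be the whole of M = S
```

Under these assumptions RL = R ∩ L = {0, e}, a one-element group with zero adjoined, and LR is the whole semigroup, in line with Lemma 3.1 and Lemmas 3.3 to 3.5.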


LEMMA 3.6. The identity element e of RL is a primitive idempotent.


Proof. We are to show that the only idempotent elements f of S for
which ef = fe = f are f = e and f = 0. By Lemma 3.4 such an f belongs
to R ∩ L, hence to RL by Lemma 3.5. RL is a group with zero (Lemma 3.3),
and so contains only the idempotents e and 0.
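Lemmas 3.3 to 3.6 can be checked by hand in a very small example. The following Python sketch does this in a semigroup chosen purely for illustration (it does not occur in the paper): the five-element semigroup of 2 × 2 matrix units with a zero adjoined, where (i, j)(k, l) = (i, l) if j = k and 0 otherwise. The names ZERO, mul and set_product are hypothetical helpers; L = Se and R = eS are taken with e = (1, 1).

```python
# Illustrative model only: 2 x 2 matrix units with a zero adjoined.
ZERO = None
ELEMENTS = [ZERO] + [(i, j) for i in (1, 2) for j in (1, 2)]

def mul(a, b):
    """(i, j)(k, l) = (i, l) if j == k, and zero otherwise."""
    if a is ZERO or b is ZERO:
        return ZERO
    return (a[0], b[1]) if a[1] == b[0] else ZERO

def set_product(X, Y):
    """All products xy with x in X and y in Y."""
    return {mul(x, y) for x in X for y in Y}

e = (1, 1)
L = set_product(ELEMENTS, [e])   # Se, a minimal left ideal
R = set_product([e], ELEMENTS)   # eS, a minimal right ideal
RL = set_product(R, L)           # expected: a group with zero
LR = set_product(L, R)           # expected: the whole semigroup
eSe = set_product([e], L)        # e(Se)

# Idempotents f with ef = fe = f; Lemma 3.6 predicts only 0 and e.
idempotents = [f for f in ELEMENTS if mul(f, f) == f]
under_e = [f for f in idempotents if mul(e, f) == f and mul(f, e) == f]
```

In this model RL consists of 0 and e alone, so the group with zero of Lemma 3.3 is the trivial group with a zero adjoined, and R ∩ L, RL and eSe coincide as Lemmas 3.4 and 3.5 assert.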


THEOREM 3.1. Let S be a semigroup without nilpotent ideals. Let M 
be a minimal two-sided ideal of S containing at least one left and at least one 
right minimal ideal of S. Then M is a completely simple semigroup. 

Proof. That M is simple follows from Theorem 1.1. By Rees’ definition 
of complete simplicity (loc. cit., p. 393), we must show: 

(1) Corresponding to each a in M there exist idempotents e and f in
M such that ea = af = a.


(2) Every idempotent element ≠ 0 of M is primitive.


To show (1), let a be any element ≠ 0 of M (the case a = 0 is trivial).




SEMIGROUPS WITHOUT NILPOTENT IDEALS. 839 


By Theorem 2.1, a belongs to some minimal left ideal L of S contained in M.
By Lemma 3.1, there exists a minimal right ideal R of S contained in M such that
LR = M and RL ≠ (0). By Lemma 3.3, RL is a group with zero. Let f
be the identity element thereof. By Lemma 3.4, L = Sf. Since a ∈ L, af = a.
The dual ea = a is proved similarly.

To show (2), let e be an idempotent element ≠ 0 of M. By Theorem 2.1
and its dual, e must belong to a minimal left ideal L of S contained in M, and to a
minimal right ideal R of S contained in M. Then e = ee ∈ LR, so that LR ≠ (0).
Since LR is a two-sided ideal ⊂ M and ≠ (0), LR = M. Likewise RL ≠ (0)
since it contains ee = e ≠ 0. Hence all three hypotheses prior to Lemma 3.2
are satisfied. By Lemma 3.3, RL is a group with zero. e must be the identity
element of RL, and (2) follows from Lemma 3.6.


THEOREM 3.2. A simple semigroup is completely simple if and only if
it contains at least one minimal left and at least one minimal right ideal.


Proof. A simple semigroup S is a minimal two-sided ideal of itself, by
definition of simplicity. Hence if it has the property stated, it is completely
simple by Theorem 3.1.

Conversely, let S be completely simple, and let e be a primitive
idempotent ≠ 0 of S. The theorem will follow when we show that Se and eS are
minimal left and right ideals of S, respectively.

Suppose L is a left ideal ≠ (0) contained in Se, and let a be an element
≠ 0 of L. Since S is simple, SaS = S, and we can solve xay = e for x and
y in S. Since a ∈ Se, ae = a. Since ex·a·eye = eee = e, we can assume
ex = x, ey = ye = y. Let f = yxa. Then f² = yxayxa = yexa = yxa = f
and ef = fe = f. f ≠ 0 since y = ye = yxay = fy. Since e is primitive, f = e.
Hence e = yxa ∈ Sa ⊂ L, Se ⊂ L, Se = L, and Se is minimal. The proof that
eS is a minimal right ideal is similar.

The reader will observe that the only assumptions actually used in the
second part of the proof were: (1) S is simple, (2) S contains a primitive
idempotent e. From these assumptions, then, it follows that S contains
minimal left and right ideals, and hence that S is completely simple by
Theorem 3.1. Thus we have an independent proof of a theorem due to Rees
that (1) and (2) imply complete simplicity [7].


4. Semigroups in which every two-sided ideal contains minimal left
and right ideals. For the three lemmas of this section we assume the
following hypotheses:


(1) S is a semigroup without nilpotent ideals. 




(2) Every two-sided ideal ≠ (0) of S contains at least one left and at
least one right minimal ideal of S.


LEMMA 4.1. Every two-sided ideal ≠ (0) of S contains a minimal
two-sided ideal of S.


Proof. Let A be any two-sided ideal ≠ (0) of S. Let B [C] be the
sum of all the minimal left [right] ideals of S contained in A. By hypothesis (2), B
and C are ≠ (0). By Theorem 2.1 and its dual, B and C are two-sided
ideals of S. Applying (2) again to B, we see that B contains at least one
minimal right ideal of S, and hence D = B ∩ C ≠ (0). Clearly D is a
two-sided ideal which is the sum of all the minimal left [right] ideals of S
contained in D.

By (2), D contains a minimal left ideal L of S. DL is a left ideal of
S contained in L. DL = (0) would entail L² = (0), contrary to (1), whence DL = L.
LD is a two-sided ideal of S contained in D. LD ≠ (0) since otherwise L² = LDL
= (0), contrary to (1). Since D is the sum of its minimal right ideals, it
must contain a minimal right ideal R of S such that LR ≠ (0).

Since LR is a two-sided ideal ≠ (0) of S contained in D, and hence in
A, the lemma will follow when we show that LR is minimal. Let H be a
two-sided ideal ≠ (0) of S contained in LR. Let lr be an element ≠ 0 of H (l ∈ L, r ∈ R).
Now Sl = L. For Sl is a left ideal of S contained in L, and were Sl = (0), the
two-element set (0, l) would be a nilpotent left ideal of S. Similarly rS = R.
Hence LR = Sl·rS ⊂ SHS ⊂ H, whence H = LR.


LEMMA 4.2. Every minimal left ideal of S is contained in some minimal
two-sided ideal of S.


Proof. Let L be a minimal left ideal of S. L ∪ LS is a two-sided ideal
of S containing L, and we contend that it is minimal. Let B be a two-sided
ideal ≠ (0) of S contained in it. BL is a left ideal of S contained in L. BL = (0)
would imply B² ⊂ B(L ∪ LS) = BL ∪ BLS = (0), contrary to (1). Hence
BL = L, and L ⊂ B. But then L ∪ LS ⊂ B, whence equality follows.


LEMMA 4.3. Every left ideal ≠ (0) of S contains a minimal left ideal.


Proof. Let A be a left ideal ≠ (0) of S. By Lemma 4.1, the two-sided
ideal AS (evidently ≠ (0) since A² ≠ (0)) contains a minimal two-sided ideal
M of S. Let m = as be an element ≠ 0 of M (a ∈ A; s ∈ S). The left ideal
B = (a, Sa) consisting of Sa and the single element a, is contained in A,
and Bs = (as, Sas) = (m, Sm) ⊂ M. Let C be the set of all c in S such
that Bc ⊂ M. C is clearly a right ideal of S containing the element s ≠ 0
of S. BC is a two-sided ideal of S contained in M and ≠ (0) since it contains
m = as ≠ 0. Hence BC = M. Now MB is a left ideal of S contained in
M ∩ B, and ≠ (0) since otherwise M² = MBC = (0). By Theorem 2.1,




M is the sum of minimal left ideals of S, and hence MB must contain at
least one minimal left ideal L of S. The desired result then follows from
A ⊃ B ⊃ MB ⊃ L.


THEOREM 4.1. Let S be a semigroup without nilpotent ideals, and in
which every two-sided ideal contains at least one left and at least one right
minimal ideal of S. Then every left ideal L ≠ (0) of S contains an
idempotent element e ≠ 0. If L is minimal, L = Se.


Proof. By Lemma 4.3, the given left ideal contains a minimal left
ideal L of S. By Lemma 4.2, L is contained in a minimal two-sided ideal
M of S. By Lemma 3.1, M contains a minimal right ideal R of S such that
LR = M and RL ≠ (0). Hence the three hypotheses prior to Lemma 3.2
are satisfied. By Lemma 3.3, RL is a group with zero, and the identity
element e thereof is a non-zero idempotent belonging to L. By Lemma 3.4,
L = Se.

An ideal A (left or right) of S is called a nilideal if every element a
of A is nilpotent, i.e. a^n = 0 for some positive integer n.


COROLLARY 4.1. Under the hypotheses of Theorem 4.1, S contains no
nilideal ≠ (0).

Proof. Suppose A were a left nilideal ≠ (0). Then by Theorem 4.1,
A would contain an idempotent e ≠ 0. No power of e is 0, contrary to the
assumption that A is a nilideal.


5. Semigroups with kernel 𝔑 and maximal 𝔑-potent ideal ℜ. Let S
be a semigroup in which the intersection 𝔑 of all the two-sided ideals of S
is not vacuous. 𝔑 is the "Suschkewitsch kernel" (Rees, loc. cit., p. 392).
If S has a zero, 𝔑 = (0). Otherwise 𝔑 is a simple semigroup without zero.

Following Schwarz (loc. cit., p. 39) a left or right ideal A of S is called
𝔑-potent if some power A^n of A is ⊂ 𝔑.

LEMMA 5.1. The sum of two 𝔑-potent left (or right) ideals is also
𝔑-potent. (Schwarz, Theorem 38, p. 39).


Proof. Let A and B be 𝔑-potent left ideals, and let A^m ⊂ 𝔑, B^n ⊂ 𝔑.
Let C = A ∪ B. Then C^(m+n) ⊂ 𝔑. For any product of m + n elements of
C must contain at least m factors from A or at least n from B; in the first
case it belongs to A^m, in the second to B^n.

LEMMA 5.2. Every 𝔑-potent left (or right) ideal is contained in an
𝔑-potent two-sided ideal. (Schwarz, Theorem 45, p. 51).


Proof. Let L be an 𝔑-potent left ideal, and let L^n ⊂ 𝔑. Then
(LS)^n ⊂ L^n S ⊂ 𝔑S ⊂ 𝔑. L ∪ LS is 𝔑-potent by Lemma 5.1, and is a
two-sided ideal containing L.

Schwarz defines (p. 51) the radical ℜ of S to be the sum of all the
𝔑-potent two-sided ideals of S. ℜ is evidently a two-sided ideal of S, and
by Lemma 5.2 contains every 𝔑-potent left (or right) ideal of S. It need
not be 𝔑-potent itself. We shall, however, impose the condition that ℜ be
𝔑-potent; this is clearly equivalent to requiring that S contain a maximal
𝔑-potent two-sided ideal, which must of course be ℜ. As an immediate
consequence of Lemma 5.2 we have:


LEMMA 5.3. Let S be a semigroup having a kernel 𝔑 and a maximal
𝔑-potent two-sided ideal ℜ. Then ℜ contains every 𝔑-potent left (or right)
ideal of S.


Let A be a two-sided ideal of a semigroup S. Rees (loc. cit., p. 389)
defines the difference semigroup S̄ = S − A to be that obtained from S by
collapsing A into a single zero element 0̄, while the remaining elements of S
retain their identity. Thus there is a one-to-one correspondence s ↔ s̄
between the elements s of S not in A and the non-zero elements s̄ of S̄,
such that st ↔ s̄t̄ if st ∉ A, and such that s̄t̄ = 0̄ if and only if st ∈ A. This
induces a one-to-one correspondence between the class of all left [right,
two-sided] ideals B of S containing A and the class of all left [right,
two-sided] ideals B̄ of S̄.
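The construction of the difference semigroup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name difference_semigroup, the label "0bar" for the new zero, and the example (the integers modulo 8 under multiplication, with the ideal A = {0, 4}) are hypothetical choices, not taken from the paper.

```python
def difference_semigroup(elements, mul, ideal):
    """Collapse the two-sided ideal `ideal` of (elements, mul) to a single zero."""
    ideal = set(ideal)
    zero = "0bar"  # hypothetical label for the new zero element
    survivors = [s for s in elements if s not in ideal]

    def qmul(a, b):
        # Products involving the zero, or landing in the ideal, collapse to zero.
        if a == zero or b == zero:
            return zero
        p = mul(a, b)
        return zero if p in ideal else p

    return [zero] + survivors, qmul, zero

# Example: multiplication modulo 8, with the two-sided ideal A = {0, 4}.
S = list(range(8))
times = lambda a, b: (a * b) % 8
A = {0, 4}
Sbar, qmul, zero = difference_semigroup(S, times, A)
```

Here 2·2 = 4 lies in A, so the product of the images of 2 and 2 is the zero of the difference semigroup, while 3·5 = 7 lies outside A and is retained.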


THEOREM 5.1. Let S be a semigroup having a kernel 𝔑 and a maximal
𝔑-potent ideal ℜ. Then the difference semigroup S̄ = S − ℜ is a semigroup
without nilpotent ideals.


Proof. Suppose Ā is a nilpotent left ideal of S̄, and let Ā^n = (0̄). Let
A be the left ideal of S containing ℜ corresponding to Ā. Then A^n ⊂ ℜ.
But ℜ^m ⊂ 𝔑 for some m, by hypothesis, whence A^(mn) ⊂ 𝔑. Thus A is 𝔑-potent
and so A ⊂ ℜ by Lemma 5.3. But this entails Ā = (0̄) in the difference
semigroup S̄.


THEOREM 5.2. Let S be a semigroup having a kernel 𝔑 and a maximal
𝔑-potent ideal ℜ. Assume furthermore that every non-𝔑-potent two-sided
ideal of S contains at least one left and at least one right minimal
non-𝔑-potent ideal of S. Then every left (or right) non-𝔑-potent ideal of S
contains an idempotent element not in ℜ.


Proof. By Theorem 5.1, S̄ = S − ℜ is a semigroup without nilpotent
ideals. The translation to S̄ of the stated minimality condition on S is simply




that occurring in the hypothesis of Theorem 4.1. Let A be any non-𝔑-potent
left ideal of S. Then the corresponding left ideal Ā of S̄ is ≠ (0̄). Hence
by Theorem 4.1, Ā contains an idempotent ē ≠ 0̄. The corresponding element
e in A is then an idempotent not in ℜ.

We shall say that S satisfies the "descending chain condition for left
[right, two-sided] ideals" if S does not contain an infinite sequence of left
[right, two-sided] ideals A_i (i = 1, 2, ···) such that A_1 ⊃ A_2 ⊃ A_3 ⊃ ···,
each inclusion being proper.
This is equivalent to the "minimal condition for left [right, two-sided]
ideals," that every non-empty set of left [right, two-sided] ideals contain a
minimal member. The "ascending chain condition" and the equivalent
"maximal condition" are defined in an analogous way.


THEOREM 5.3. Let S be a semigroup satisfying the descending chain
condition both for left and for right ideals, and the ascending chain condition
for two-sided ideals. Then S has a kernel 𝔑, which is a completely simple
semigroup without zero, and a maximal 𝔑-potent ideal ℜ. Every left ideal
of S not contained in ℜ contains an idempotent element not in ℜ.


Proof. From the descending chain conditions it is clear that S contains
a single minimal two-sided ideal, i.e. that it has a kernel 𝔑, and that 𝔑 is
completely simple without zero (Theorem 3.2, p. 525 of Ref. 1, or Theorem
3.1 above with the appropriate change in the meaning of "minimal").
From the ascending chain condition it is clear that S contains a maximal
𝔑-potent two-sided ideal ℜ. Again from the descending chain conditions,
the remaining hypothesis of Theorem 5.2 is satisfied, and the conclusion
follows.


COROLLARY 5.3. Let S satisfy the conditions of Theorem 5.3. Then,
if S contains no idempotent element outside of ℜ, S is itself 𝔑-potent. In
particular, if S contains at most one idempotent, some power of S is a group.


Proof. The first part is immediate from Theorem 5.3, which requires
S = ℜ. The second part follows from the fact that a completely simple
semigroup without zero is a sum of groups, and reduces to a group if and only
if it contains exactly one idempotent.

This corollary is very much like Schwarz’s Theorem 42 (p. 46). Schwarz 
assumes that every element of S has finite order, but requires the descending 
chain condition only for left ideals. He notes that the finite order require- 
ment can be dropped if S is commutative, in which case there is of course 
no difference between left and right ideals. 

The ascending chain condition was used only to insure that ℜ be
𝔑-potent, and so the latter condition may replace the former in the hypotheses
of Theorem 5.3 and its corollary. The new corollary then bears the same 
resemblance to Schwarz’s Theorem 46 (p. 52) that the old one did to his 
Theorem 42. 

We conclude with a remark on a second possible definition (and there
are doubtless others!) of the radical of a semigroup. Call a left [right]
ideal A of S a left [right] nilideal if every element a of A is 𝔑-potent,
i.e. a^n ∈ 𝔑 for some n. The sum ℜ* of all two-sided nilideals is a two-sided
nilideal containing all left and all right nilideals. Evidently ℜ* ⊃ ℜ. But
if S satisfies the hypotheses of Theorem 5.2 (or 5.3) then ℜ* = ℜ. For
clearly no nilideal can contain an idempotent outside of 𝔑.


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


1. A. H. Clifford, "Semigroups containing minimal ideals," American Journal of
Mathematics, vol. 70 (1948), pp. 521-526. This paper was in turn an
extension to the infinite case of certain results of A. Suschkewitsch, "Über
die endlichen Gruppen ohne das Gesetz der eindeutigen Umkehrbarkeit,"
Mathematische Annalen, vol. 99 (1928), pp. 30-50.

2. D. Rees, "On semi-groups," Proceedings of the Cambridge Philosophical Society,
vol. 36 (1940), pp. 387-400; the definition is given on p. 393.

3. In a forthcoming paper, R. P. Rich points out that the condition that S contain no
nilpotent ideal ≠ (0) can be replaced by the much weaker and more
appropriate condition M² ≠ (0), throughout the first three sections of the
present paper.

4. R. Baer and F. Levi, "Vollständige irreduzibele Systeme von Gruppenaxiomen,"
Sitzungsberichte der Heidelberger Akademie der Wissenschaften,
Mathematisch-naturwissenschaftliche Klasse, Beiträge zur Algebra, No. 18 (1932),
12 pp.; the example is given on p. 7.

5. Stefan Schwarz, "Zur Theorie der Halbgruppen," Sbornik prac Prirodovedeckej
fakulty Slovenskej univerzity v Bratislave, No. 6 (1943), 64 pp. (Slovakian,
German summary); Mathematical Reviews, vol. 10 (1949), p. 12.

6. A. H. Clifford and D. D. Miller, "Semigroups having zeroid elements," American
Journal of Mathematics, vol. 70 (1948), pp. 117-125. Bruck's proof is used
in § 2 of this paper and again in § 3 of ref. 1 above.

7. D. Rees, "Note on semi-groups," Proceedings of the Cambridge Philosophical Society,
vol. 37 (1941), pp. 434-435.




THE RADIUS OF UNIVALENCE OF AN ANALYTIC FUNCTION.* 


By ZEEV NEHARI.


A family of analytic functions $f(z)$ which is normal and compact in a
domain $D$ and satisfies $|f'(\zeta)| \ge C > 0$ at a point $\zeta$ in $D$, has a circle of
univalence about $\zeta$, i.e. there exists a positive number $r$ such that all
functions of the family are schlicht in the interior of the circle $|z - \zeta| = r$.
While the existence of a circle of univalence follows easily from the theory
of normal families, the determination of the "exact" radius of univalence
of a given family of functions may be a very difficult task. In the case of
the family of bounded functions in the unit circle, the exact radius of
univalence was found by Landau [3].

In the present paper, we shall determine the exact radii of univalence
of two families of functions which are defined in an arbitrary bounded
domain $D$ of finite connectivity.

Let $S$ denote the family of functions $f(z)$ for which the function
$\log \{[f(z) - f(\zeta)]/[z - \zeta]\}$ is regular and single-valued in $D$, where $\zeta \in D$ is
a given fixed point of $D$. For the sake of convenience we shall further
assume that $\zeta = 0$. $f(z)$ being in $S$, this implies that also $f(0) = 0$.
Obviously, this assumption is no restriction on the generality of our
considerations.

We begin with the following theorem: 


THEOREM I. Let $f(z)$ be in $S$ and let $M = \mathrm{l.u.b.}_{z \in D}\, |\log [f(z)/z]|$; then
$f(z)$ is schlicht in the largest circle about the origin all of whose points satisfy


(1) $|z|\, K(z, \bar z) \le 2\pi/M$.


Here, $K(z, \bar\zeta)$ denotes the Szegö kernel function [2, 4] of $D$. This value of
the radius of univalence is the best possible.


Proof. By hypothesis, $\log [f(z)/z]$ is regular and single-valued in $D$.
We therefore have, by the residue theorem,


(2) $\dfrac{f'(z)}{f(z)} - \dfrac{1}{z} = \dfrac{1}{2\pi i} \displaystyle\int_\Gamma \log\bigl(f(x)/x\bigr)\, q(x)\, dx,$


* Received August 18, 1948. 


846 ZEEV NEHARI. 


where $q(x)$ is an arbitrary single-valued function in $D$ which is regular in
$D + \Gamma$ except at the point $x = z$ where it has the principal part $(x - z)^{-2}$.
In order that the integration be permissible, we assume further that the
boundary $\Gamma$ of $D$ consists of smooth curves and that $f(x)$ is continuous on $\Gamma$; once
the result is obtained, both assumptions can easily be disposed of. Indeed,
if $D$ is not smoothly bounded, we may approximate $D$ by a sequence of
domains $D_n$ which satisfy $D_n \subset D$, $D_n \subset D_{n+1}$, $\lim D_n = D$, and whose
boundaries $\Gamma_n$ are smooth. If we replace $D$ by $D_n$, the additional assumptions
under which we prove Theorem I are satisfied. The general result then
follows by letting $n \to \infty$ and observing that $\mathrm{l.u.b.}\, |\log z^{-1} f(z)| \le M$ for
$z \in D_n$, and that the Szegö kernel function $K(z, \bar z)$ is a continuous domain
function.

The particular function $q(x)$ in (2) we shall choose is connected with
the function $F(x) = F(x, z)$, introduced by Ahlfors [1], which solves the
following extremal problem: Among all functions $g(x)$ which are regular
and single-valued in $D$ and satisfy there $|g(x)| \le 1$, $g(z) = 0$, to find the
function which maximizes $|g'(z)|$. If $n$ is the connectivity of $D$, it was
shown by Ahlfors that $F(x)$ maps $D$ onto the $n$-times covered unit circle.
More recently, it was shown by Garabedian [2] that there exists a function
$q(x)$ which is regular in $D + \Gamma$ except at the point $x = z$, where its principal
part is $(x - z)^{-2}$, and which has the property


(3) $(1/i)\, F(x)\, q(x)\, dx > 0, \quad x \in \Gamma.$

We now identify $q(x)$ in (2) with this particular function. By (2),
we have

$I = \left| \dfrac{f'(z)}{f(z)} - \dfrac{1}{z} \right| \le \dfrac{M}{2\pi} \displaystyle\int_\Gamma |q(x)\, dx|.$

Since, on $\Gamma$, $|F(x)| = 1$, this may also be written

$I \le \dfrac{M}{2\pi} \displaystyle\int_\Gamma |F(x)\, q(x)\, dx| = \dfrac{M}{2\pi i} \displaystyle\int_\Gamma F(x)\, q(x)\, dx,$

the last identity following from (3). Using the residue theorem, we therefore
obtain $|f'(z)/f(z) - 1/z| \le M F'(z)$ or

(4) $|z f'(z)/f(z) - 1| \le M\, |z|\, F'(z).$

A sufficient condition for $f(z)$ to be schlicht in a circle $C$ about the
origin is


(5) $\mathrm{Re}\{z f'(z)/f(z)\} \ge 0$




THE RADIUS OF UNIVALENCE. 847 


in $C$; (5) even assures that $C$ be mapped by $f(z)$ onto a star-like domain.
(5) is obviously satisfied if $|z f'(z)/f(z) - 1| \le 1$. In view of (4), a circle
about the origin all of whose points satisfy


(6) $M\, |z|\, F'(z) \le 1$


will therefore be mapped by $f(z)$ onto a schlicht star-like domain.
It was shown by Garabedian [2] that $2\pi F'(z) = K(z, \bar z)$, where $K(z, \bar\zeta)$
denotes the kernel


$K(z, \bar\zeta) =$


of a complete system of regular functions in D which is ortho-normalized by 
the conditions 


(6) may therefore also be written $|z|\, K(z, \bar z) \le 2\pi/M$, which is identical
with (1). This shows that the circle defined in the statement of Theorem I
is indeed mapped by $f(z)$ onto a schlicht domain.

In order to show that this is the best possible lower bound for the radius
of univalence, we consider the function

$f_0(z) = z \exp\{-M e^{-i\gamma} F(z, re^{i\gamma})\},$

where $z = re^{i\gamma}$ is the point nearest to the origin among those points $z$ for
which equality holds in (1). Obviously,

$\mathrm{l.u.b.}_{z \in D}\, |\log [f_0(z)/z]| = M.$

Setting $f_0(z) = f(z)$ in (2), we obtain

$f_0'(re^{i\gamma})/f_0(re^{i\gamma}) - 1/(re^{i\gamma}) = -M e^{-i\gamma} F'(re^{i\gamma}).$

Since for $z = re^{i\gamma}$ we have equality in (1), we have therefore

$f_0'(re^{i\gamma})/f_0(re^{i\gamma}) - 1/(re^{i\gamma}) = -1/(re^{i\gamma}),$

whence $f_0'(re^{i\gamma}) = 0$. This shows that $f_0(z)$ cannot be schlicht in a circle
about the origin with a radius greater than $r$.

As an example we consider the case where $D$ is the unit circle $|z| < 1$.
In this case, $K(z, \bar z) = 2\pi (1 - |z|^2)^{-1}$ [4]. By Theorem I, the radius of
univalence of the family of functions for which $|\log [f(z)/z]| \le M$ in
$|z| < 1$ is therefore $r = \tfrac12 \{(M^2 + 4)^{1/2} - M\}$. The exact value of this radius
of univalence is attained by the function

$f_0(z) = z \exp\{-M([z - r]/[1 - rz])\}.$
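The unit-circle example can be checked numerically. The following Python sketch assumes the kernel value $K(z, \bar z) = 2\pi(1 - |z|^2)^{-1}$ quoted above; the function names radius_theorem_I and f0_prime are hypothetical. It computes $r$ for a given $M$, confirms that equality holds in (1) at $|z| = r$, and confirms that the derivative of the extremal function vanishes at $z = r$.

```python
import cmath
import math

def radius_theorem_I(M):
    """r = (1/2) * (sqrt(M^2 + 4) - M), the positive root of r^2 + M r - 1 = 0."""
    return 0.5 * (math.sqrt(M * M + 4.0) - M)

def f0_prime(z, M, r):
    """Derivative of f0(z) = z * exp(-M * w(z)) with w(z) = (z - r) / (1 - r z)."""
    w = (z - r) / (1 - r * z)
    dw = (1 - r * r) / (1 - r * z) ** 2   # w'(z)
    return cmath.exp(-M * w) * (1 - M * z * dw)

M = 1.0
r = radius_theorem_I(M)
lhs = r * 2.0 * math.pi / (1.0 - r * r)   # |z| K(z, z-bar) at |z| = r
rhs = 2.0 * math.pi / M
```

For every $M > 0$ the quantity `lhs` should agree with `rhs` up to rounding, and `f0_prime(r, M, r)` should vanish, which is the reason $f_0$ fails to be schlicht in any larger circle.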


The next theorem gives the circle of univalence of a function $f(z)$ in
$S$, when instead of $M$ a positive number $N$ satisfying


(7) $e^{-N} \le |f(z)/z| \le e^{N}$


is known. Since the $M$ of Theorem I can be used as an $N$ but not vice versa,
this constitutes a stronger result.


THEOREM II. Let $f(z)$ be in $S$ and let $N$ be defined by (7); then $f(z)$
is schlicht in the largest circle about the origin all of whose points satisfy


(8) $|z|\, K(z, \bar z) \le \pi^2/(2N),$


where $K(z, \bar\zeta)$ is again the Szegö kernel. This lower bound for the radius
of univalence is the best possible.


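Proof. We start again with the identity (2), but we now identify $q(x)$
with a different function. If $F(x) = F(x, z)$ denotes again the same function
as above, we find, in view of $|F(x)| = 1$, $x \in \Gamma$, that

$\theta F(x)/\{1 + \theta^2 F^2(x)\} \quad (|\theta| = 1)$

is real on $\Gamma$. In view of (3), the differential

(9) $\dfrac{1}{i}\, q_1(x)\, dx = \dfrac{1}{i\theta}\, q(x)\, [1 + \theta^2 F^2(x)]\, dx$

will also be real on $\Gamma$. The principal part of $q_1(x)$ at $x = z$ is $\theta^{-1} (x - z)^{-2}$.
We may therefore use $q_1(x)$ instead of $q(x)$ in (2). This yields

$\theta^{-1} \bigl( f'(z)/f(z) - 1/z \bigr) = \dfrac{1}{2\pi i \theta} \displaystyle\int_\Gamma \log\bigl(f(x)/x\bigr)\, [1 + \theta^2 F^2(x)]\, q(x)\, dx.$

Taking real parts, we obtain, in view of the reality of the differential (9),

$\mathrm{Re}\{\theta^{-1} ( f'(z)/f(z) - 1/z )\} = \dfrac{1}{2\pi i \theta} \displaystyle\int_\Gamma \log |f(x)/x|\, [1 + \theta^2 F^2(x)]\, q(x)\, dx,$

whence

(10) $|\mathrm{Re}\{\theta^{-1} ( f'(z)/f(z) - 1/z )\}| \le \dfrac{N}{2\pi} \displaystyle\int_\Gamma \bigl| [1 + \theta^2 F^2(x)]\, q(x)\, dx \bigr|.$

We cannot now drop the absolute value signs at the right-hand side and use
the residue theorem, since the differential (9), while real, is not of constant
sign. This difficulty can, however, be overcome in the following way. Writing

$q_1(x)\, dx/i = [F(x)\, q(x)\, dx/i]\, [1 + \theta^2 F^2(x)]/[\theta F(x)],$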


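we see that $q_1(x)\, dx/i$ is positive for $-\pi/2 < \arg \theta F(x) < \pi/2$ and negative
for $\pi/2 < \arg \theta F(x) < 3\pi/2$. Since, if

$G_\theta(x) = [1 + i\theta F(x)]/[1 - i\theta F(x)],$

$\arg G_\theta(x)$ is equal to $\pi/2$ for $-\pi/2 < \arg \theta F(x) < \pi/2$ and equal to $-\pi/2$
for $\pi/2 < \arg \theta F(x) < 3\pi/2$, we have

$|q_1(x)\, dx/i| = (\tfrac12 \pi i \theta)^{-1} q(x)\, [1 + \theta^2 F^2(x)]\, \arg G_\theta(x)\, dx$
$\qquad = (\tfrac12 \pi i \theta)^{-1} q(x)\, [1 + \theta^2 F^2(x)]\, \mathrm{Im} \log G_\theta(x)\, dx$
$\qquad = \mathrm{Im}\{ (\tfrac12 \pi i \theta)^{-1} q(x)\, [1 + \theta^2 F^2(x)]\, \log G_\theta(x)\, dx \}.$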


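Using (10), we thus obtain

$|\mathrm{Re}\, \theta^{-1} \{ f'(z)/f(z) - 1/z \}| \le \dfrac{N}{2\pi}\, \mathrm{Im} \displaystyle\int_\Gamma (\tfrac12 \pi i \theta)^{-1} q(x)\, [1 + \theta^2 F^2(x)]\, \log G_\theta(x)\, dx.$

While the integrand has singularities on $\Gamma$, it is continuous there. We may
therefore evaluate the integral on the right-hand side by the residue theorem.
This yields

$|\mathrm{Re}\, \theta^{-1} ( f'(z)/f(z) - 1/z )| \le \dfrac{N}{2\pi}\, \mathrm{Im}\{ 8 i F'(z) \} = \dfrac{4N}{\pi}\, F'(z),$

or, because of the arbitrariness of the argument of $\theta$,

$|z f'(z)/f(z) - 1| \le (4/\pi)\, N\, |z|\, F'(z) = (2N/\pi^2)\, |z|\, K(z, \bar z).$

For values $z$ which satisfy (8) we shall therefore have

$|z f'(z)/f(z) - 1| \le 1.$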
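The residue evaluation used in the preceding step can be written out as follows. This is a reconstruction under stated assumptions, not a quotation from the paper: it takes $G_\theta(x) = [1 + i\theta F(x)]/[1 - i\theta F(x)]$, uses $F(z) = 0$ (so that $G_\theta(z) = 1$ and $\log G_\theta(z) = 0$), and uses the fact that $q(x)$ has the principal part $(x - z)^{-2}$ at $x = z$, so that the residue of $q(x)h(x)$ there equals $h'(z)$ for any regular $h$.

```latex
\begin{aligned}
h(x) &= [1 + \theta^2 F^2(x)]\,\log G_\theta(x), \qquad h(z) = 0, \\
h'(z) &= \frac{d}{dx}\log G_\theta(x)\Big|_{x=z}
       = \frac{i\theta F'(z)}{1 + i\theta F(z)} + \frac{i\theta F'(z)}{1 - i\theta F(z)}
       = 2 i \theta F'(z), \\
\int_\Gamma \bigl(\tfrac12 \pi i \theta\bigr)^{-1} q(x)\,h(x)\,dx
      &= 2\pi i \cdot \frac{2}{\pi i \theta} \cdot 2 i \theta F'(z)
       = 8 i F'(z), \\
\frac{N}{2\pi}\,\operatorname{Im}\{8 i F'(z)\} &= \frac{4N}{\pi}\,F'(z).
\end{aligned}
```

With $2\pi F'(z) = K(z, \bar z)$ this gives the bound $(2N/\pi^2)\,|z|\,K(z, \bar z)$ after multiplication by $|z|$, in agreement with condition (8).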


Since such values satisfy a fortiori the inequality (5), a circle about the 
origin all of whose points z are subject to (8) will be mapped by f(z) onto 


a schlicht star-like domain. 
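In order to show that this lower bound for the radius of univalence is
the best possible, we denote by $re^{i\gamma}$ the point nearest to the origin for which
equality holds in (8) and consider the function

$f_0(z) = z \bigl( [1 - i e^{-i\gamma} F(z)]/[1 + i e^{-i\gamma} F(z)] \bigr)^{2N/\pi i}, \quad F(z) = F(z, re^{i\gamma}).$

Clearly,

$e^{-N} \le |f_0(z)/z| \le e^{N}, \quad z \in D.$

We have

$1/r - e^{i\gamma} f_0'(re^{i\gamma})/f_0(re^{i\gamma})$
$\qquad = \dfrac{N e^{i\gamma}}{\pi^2} \displaystyle\int_\Gamma \log \bigl( [1 - i e^{-i\gamma} F(x)]/[1 + i e^{-i\gamma} F(x)] \bigr)\, \bigl(1 + e^{-2i\gamma} F^2(x)\bigr)\, q(x)\, dx$
$\qquad = (4/\pi)\, N F'(re^{i\gamma}) = (2/\pi^2)\, N K(re^{i\gamma}, re^{-i\gamma}) = 1/r,$

the last step following from the fact that, for $z = re^{i\gamma}$, equality holds in (8).
Hence, $f_0'(re^{i\gamma}) = 0$, which shows that $f_0(z)$ cannot be schlicht in a circle
about the origin of radius larger than $r$. This completes the proof of Theorem II.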

As an illustration, we again consider the case where $D$ is the unit circle.
Since, in this case, $K(z, \bar z) = 2\pi (1 - |z|^2)^{-1}$, we have the result: If
$z^{-1} f(z) \ne 0$ in $|z| < 1$ and $e^{-N} \le |z|^{-1} |f(z)| \le e^{N}$, then $f(z)$ is schlicht
in the circle $|z| < r$ with

$r = \pi / [(4N^2 + \pi^2)^{1/2} + 2N].$

The exact value of this bound is attained by the function

$f_0(z) = z \bigl\{ [1 + ir - iz(1 - ir)]/[1 - ir + iz(1 + ir)] \bigr\}^{2N/\pi i}.$
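The unit-circle case of Theorem II admits the same kind of numerical check as that of Theorem I. The Python sketch below assumes $K(z, \bar z) = 2\pi(1 - |z|^2)^{-1}$ and takes the exponent of the extremal function to be $2N/(\pi i)$, which is a reconstruction since the scan does not show it legibly; the function names are hypothetical. It verifies that $r = \pi/[(4N^2 + \pi^2)^{1/2} + 2N]$ gives equality in (8) and that the logarithmic derivative of $f_0$ vanishes at $z = r$.

```python
import math

def radius_theorem_II(N):
    """r = pi / (sqrt(4 N^2 + pi^2) + 2 N), the positive root of pi r^2 + 4 N r - pi = 0."""
    return math.pi / (math.sqrt(4.0 * N * N + math.pi ** 2) + 2.0 * N)

def log_derivative_f0(z, N, r):
    """f0'(z)/f0(z) for f0(z) = z * g(z)**c, where
    g = (1 - i w)/(1 + i w), w = (z - r)/(1 - r z), c = 2N/(pi i) (assumed exponent)."""
    c = 2.0 * N / (math.pi * 1j)
    w = (z - r) / (1 - r * z)
    dw = (1 - r * r) / (1 - r * z) ** 2      # w'(z)
    # g'(z)/g(z) = -2 i w'(z) / (1 + w(z)^2)
    return 1 / z + c * (-2j * dw / (1 + w * w))

N = 1.0
r = radius_theorem_II(N)
lhs = r * 2.0 * math.pi / (1.0 - r * r)      # |z| K(z, z-bar) at |z| = r
rhs = math.pi ** 2 / (2.0 * N)
```

Since $f_0(r) \ne 0$, the vanishing of `log_derivative_f0(r, N, r)` is equivalent to $f_0'(r) = 0$.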


It is worthy of note that, as shown by a trivial modification of the
proof, the lower bound for the radius of univalence given by Theorem II
remains also valid if the condition $|\mathrm{Re}\{\log z^{-1} f(z)\}| \le N$ is replaced by

$|\mathrm{Re}\{ e^{i\beta} \log [f(z)/z] \}| \le N,$

where the real constant $\beta$ is arbitrary. In particular, the constant $N$ in Theorem
II may be so chosen as to satisfy

$|\arg f(z) - \arg z| \le N, \quad z \in D.$

By a slight modification of the methods of proof used above it is possible
to obtain two sharp estimates for the radius of convexity of the family $T$ of
functions for which $\log f'(z)$ is regular and single-valued in $D$. Here, the
radius of convexity of a function $f(z)$ about $z = 0$ is defined as the radius
of the largest circle about the origin which is mapped by $f(z)$ onto a convex
schlicht domain. We have the following two results.


THEOREM III. Let $f(z)$ be in $T$ and let

$|\log f'(z)| \le M, \quad z \in D;$

let further $r$ denote the radius of the largest circle about the origin all of
whose points satisfy

$|z|\, K(z, \bar z) \le 2\pi/M;$

then the circle $|z| < r$ is mapped by $f(z)$ onto a convex schlicht domain.
This lower bound for the radius of convexity is the best possible.


THEOREM IV. Let $f(z)$ be in $T$ and $N$ such that either

(11) $e^{-N} \le |f'(z)| \le e^{N}$

or

(12) $|\arg f'(z)| \le N$

is satisfied. If $r$ denotes the radius of the largest circle about the origin all
of whose points satisfy

$|z|\, K(z, \bar z) \le \pi^2/(2N),$

then the circle $|z| < r$ is mapped by $f(z)$ onto a convex schlicht domain.
This bound is again the best possible, irrespective of whether $N$ is defined by
(11) or (12).


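We have

(13) $\dfrac{f''(z)}{f'(z)} = \dfrac{1}{2\pi i} \displaystyle\int_\Gamma \log f'(x)\, q(x)\, dx,$

where $q(x)$ has the same meaning as in (2). Hence,

(14) $\left| \dfrac{f''(z)}{f'(z)} \right| \le \dfrac{M}{2\pi i} \displaystyle\int_\Gamma F(x)\, q(x)\, dx = M F'(z) = \dfrac{1}{2\pi}\, M K(z, \bar z),$

where $F(x)$ again denotes the Ahlfors extremal function.
A necessary and sufficient condition for a circle about the origin to be
mapped onto a convex schlicht domain is

$1 + \mathrm{Re}\{z f''(z)/f'(z)\} \ge 0.$

This condition is certainly satisfied if

$|z f''(z)/f'(z)| \le 1.$

In view of (14), any circle about the origin all of whose points satisfy

(15) $|z|\, K(z, \bar z) \le 2\pi/M$

will therefore be a circle of convexity of $f(z)$.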




In order to show that this bound is the best possible, we denote by $re^{i\gamma}$
the nearest point to the origin for which equality is attained in (15), and
consider the function

$f_0(z) = \displaystyle\int_0^z \exp\{ -M e^{-i\gamma} F(z, re^{i\gamma}) \}\, dz.$

Substituting $f_0(z)$ for $f(z)$ in (13), it is readily found that

$f_0''(re^{i\gamma})/f_0'(re^{i\gamma}) = -M e^{-i\gamma} F'(re^{i\gamma}) = -\dfrac{1}{2\pi}\, M e^{-i\gamma} K(re^{i\gamma}, re^{-i\gamma}) = -e^{-i\gamma}/r,$

the last step following from the fact that, for $z = re^{i\gamma}$, equality holds in (15).
Hence,

$1 + re^{i\gamma} f_0''(re^{i\gamma})/f_0'(re^{i\gamma}) = 0,$

and this shows that a circle of radius larger than $r$ cannot be mapped by $f_0(z)$
onto a convex domain, since in the vicinity of $re^{i\gamma}$ there must be points $z$
for which

$1 + \mathrm{Re}\{z f_0''(z)/f_0'(z)\} < 0.$

The proof of Theorem IV and the construction of the extremal functions
follow in a similar way by making the necessary modifications in the proof
of Theorem II. We omit them here, since they present no new features.


HARVARD UNIVERSITY AND 
WASHINGTON UNIVERSITY. 


REFERENCES. 


1. L. V. Ahlfors, "Bounded analytic functions," Duke Mathematical Journal, vol. 14
(1947), pp. 1-11.

2. P. Garabedian, Schwarz's lemma and the Szegö kernel function, Thesis, Harvard
University, 1948.

3. E. Landau, "Der Picard-Schottkysche Satz und die Blochsche Konstante,"
Sitzungsberichte, Akademie der Wissenschaften, Berlin, Physikalische-Mathematische
Klasse, 1926, pp. 467-474.

4. G. Szegö, "Ueber orthogonale Polynome, die zu einer gegebenen Kurve der
komplexen Ebene gehoeren," Mathematische Zeitschrift, vol. 9 (1921), pp. 218-270.




ON LINEAR ASYMPTOTIC EQUILIBRIA.* 


By AUREL WINTNER. 


For large positive $t$, say for

(1) $0 \le t < \infty,$

let $(a_{ik})$, where $1 \le i \le n$, $1 \le k \le n$ and $a_{ik} = a_{ik}(t)$, be a matrix of $n^2$
continuous functions. Conditions will be considered under which this matrix
has the following property: All $n$ components, $x_1(t), \cdots, x_n(t)$, of every
solution of the linear differential system

(2) $x_i' = \displaystyle\sum_{k=1}^{n} a_{ik}(t)\, x_k$

tend, as $t \to \infty$, to finite limits and, if $c_1, \cdots, c_n$ are arbitrary constants,
(2) has a solution corresponding to which the $n$ limits $x_i(\infty)$ become the
respective given values $c_i$.

If this is the case, let (2) be called of type (*). The simplest example
of a system of type (*) is the trivial system $x_i' = 0$, with $x_1(t) = c_1, \cdots, x_n(t) = c_n$
as its general solution. To say that (2) is of type (*) means that the
solutions of (2) are in asymptotic one-to-one correspondence with the solutions
of the trivial system, where $(a_{ik}) = (0)$.


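(I) Let each of the $n^2$ continuous functions $a_{ik}(t)$ satisfy the following
three conditions: The integral

(3) $\displaystyle\int^{\infty} a_{ik}(t)\, dt$ is convergent in the sense $\displaystyle\int^{\infty} = \lim_{T \to \infty} \int^{T}$

and, when considered as a function of the lower limit of integration, behaves
so as to satisfy

(4) $\displaystyle\int^{\infty} \left| \int_s^{\infty} a_{ik}(t)\, dt \right| ds < \infty;$

finally,

(5) $\displaystyle\limsup_{t \to \infty} |a_{ik}(t)| < \infty.$

Then (2) is of type (*).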


* Received October 18, 1948. 



854 AUREL WINTNER. 


Easy examples show that (3), (4), (5) together fail to imply

(6) $\displaystyle\int^{\infty} |a_{ik}(t)|\, dt < \infty.$

On the other hand, while (6) implies (3), neither (4) nor (5) follows from
(6). Hence, (I) neither contains, nor is contained in, the following criterion:


(II) Let each of the $n^2$ continuous functions $a_{ik}(t)$ be such as to satisfy
(6). Then (2) is of type (*).


Still another criterion runs as follows:


(III) Let each of the $n^2$ continuous functions $a_{ik}(t)$ satisfy the
following three conditions: (3),

(4 bis) $\displaystyle\int^{\infty} \left| \int_s^{\infty} a_{ik}(t)\, dt \right|^p ds < \infty$

and

(5 bis) $\displaystyle\int^{\infty} |a_{ik}(t)|^q\, dt < \infty,$

where $p > 1$ is a fixed index and $q = p/(p - 1)$. Then (2) is of type (*).


Clearly, (I) and (II) can be interpreted as limiting cases, $p = 1$ and
$p = \infty$, of (III).

(II) is known.¹ The proof of (I) will be more elaborate. It will be
given in a form which will make it clear that the proof of (III) is exactly
the same as that of (I), except that the applications of the "first mean-value
theorem," to be based on (4) and (5), must be replaced by corresponding
applications of Hölder's inequality.

The proof of (I) proceeds as follows:


If $A$ denotes the matrix $(a_{ik})$, and $x$ the column vector in which the $i$-th
component is $x_i$, then (2) can be written as


(7) $x' = A(t)\, x.$


¹ The result goes back to Bôcher; cf. O. Dunkel, "Regular singular points of a
system of homogeneous linear differential equations of the first order," Proceedings of
the American Academy of Arts and Sciences, vol. 38 (1902), pp. 341-370, where the
independent variable corresponds to $e^{-t}$, if $t$ is the independent variable used above.
For a direct proof, cf. A. Wintner, "Small perturbations," American Journal of
Mathematics, vol. 67 (1945), pp. 417-430 (more particularly p. 428).




ON LINEAR ASYMPTOTIC EQUILIBRIA. 855 


Conditions (3), (4), (5) respectively mean that, on the half-line (1),

(8) $B(t) = \int_t^{\infty} A(s)\,ds$, where $\int_t^{\infty} = \lim_{T \to \infty} \int_t^{T}$,

defines a matrix function, $B(t)$; that

(9) $b(t) = \int_t^{\infty} |B(s)|\,ds$

defines there a scalar function, $b(t) < \infty$; and that there exists a constant, $\beta$, satisfying

(10) $|A(t)| \leq \beta$.
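A scalar illustration may help here. It is not taken from the paper; the coefficient $a(t) = \sin(e^{t})$ and the half-line $0 \leq t < \infty$ are assumptions chosen only for this sketch. With $n = 1$, the requirements expressed by (8), (9), (10) are met, although the integral of $|a(t)|$ diverges:

```latex
\begin{align*}
B(t) &= \int_t^{\infty} \sin(e^{s})\,ds
      = \int_{e^{t}}^{\infty} \frac{\sin u}{u}\,du
      = \frac{\cos(e^{t})}{e^{t}} - \int_{e^{t}}^{\infty} \frac{\cos u}{u^{2}}\,du,
      \qquad |B(t)| \le 2e^{-t}, \\
b(t) &= \int_t^{\infty} |B(s)|\,ds \le 2e^{-t} < \infty,
      \qquad |a(t)| \le 1, \\
\int_0^{\infty} |a(t)|\,dt &= \int_1^{\infty} \frac{|\sin u|}{u}\,du = \infty .
\end{align*}
```

In this sketch the solutions of the scalar equation are $x(t) = x(0)\exp\bigl(B(0) - B(t)\bigr)$, which tend to the finite limit $x(0)\exp B(0)$ as $t \to \infty$.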


In (9) and (10), the sign of absolute value is meant in the following sense: If $C$ is a matrix, $|C|$ denotes the maximum of $|Cv|$ when $|v| = 1$, where $|Cv|$ denotes the length of the vector $Cv$ into which $C$ transforms a vector $v$ of length $|v| = 1$. Thus $|C_1 + C_2| \leq |C_1| + |C_2|$ and $|C_1 C_2| \leq |C_1|\,|C_2|$.
Let y(t) be any vector function which is continuous on (1) and satisfies 


(11) $\limsup_{t \to \infty} |y(t)| < \infty$.


Then it is seen from the convergence of the integral (8) that (11) remains 
true if y(t) is replaced by B(t)y(t). Similarly, (11), (10) and the conver- 
gence of (9) imply that the integral 


$\int_t^{\infty} B(s)A(s)y(s)\,ds$


is (absolutely) convergent and, when denoted by z(t), is such that (11) 


remains true if y(t) is replaced by z(t). 
In view of these two facts, it is possible to define on (1) a sequence of continuous vector functions $x^0(t), x^1(t), \ldots$ by the recursion formula

(12) $x^{m+1}(t) = c - B(t)x^{m}(t) - \int_t^{\infty} B(s)A(s)x^{m}(s)\,ds$,

where $c$ is an arbitrarily chosen constant vector and

(13) $x^0(t) = c$.




If $m$ in (12) is replaced by $m - 1$ and the resulting relation is subtracted from (12), it is seen that

$\rho_{m+1}(t) \leq |B(t)|\,\rho_m(t) + \int_t^{\infty} |B(s)|\,|A(s)|\,\rho_m(s)\,ds$,

where $\rho_m(t) = |x^{m}(t) - x^{m-1}(t)|$. In particular, since the function

(14) $\lambda_m(t) = \sup_{t \leq s < \infty} |x^{m}(s) - x^{m-1}(s)|$

is a non-increasing majorant of $\rho_m(t)$,

$\rho_{m+1}(t) \leq |B(t)|\,\lambda_m(t) + \lambda_m(t) \int_t^{\infty} |B(s)|\,|A(s)|\,ds$.

Hence, by (10) and (9),

$\rho_{m+1}(t) \leq (|B(t)| + b(t)\beta)\,\lambda_m(t)$.


Since $\beta$ is a constant, the convergence of (8) and (9) implies that, from a certain $t$ onward, $|B(t)| + b(t)\beta < \tfrac{1}{2}$. It follows therefore from the last formula line that, if $t_0$ is large enough and

(15) $t_0 \leq t < \infty$,

then $\lambda_{m+1}(t) < \tfrac{1}{2}\lambda_m(t)$, hence $\lambda_{m+1}(t) < \lambda_1(t)/2^{m}$. Consequently, if it is ascertained that $\lambda_1(t) \to 0$ as $t \to \infty$, it will follow that, on the half-line (15), the inequality

(16) $\lambda_m(t) < \epsilon(t)/2^{m}$

holds for a function $\epsilon(t)$ which does not depend on $m$ and tends to 0 as $t \to \infty$. But (13), the case $m = 0$ of (12) and the case $m = 1$ of (14) show that $\lambda_1(t) \to 0$ is equivalent to

$-B(t)c - \int_t^{\infty} B(s)A(s)c\,ds \to 0$, $(t \to \infty)$,

and the truth of the latter limit relation, in which $c$ is a constant, is clear from (10) and from the convergence of both integrals (8), (9). This proves that $\epsilon(t) \to 0$ in (16).

It follows from (14) and (16) that, on the half-line (15), a vector function $x(t)$ can be defined by placing

$x(t) = x^0(t) + \sum_{m=0}^{\infty} \{x^{m+1}(t) - x^{m}(t)\}$.




The convergence of this series is uniform on (15); so that, since every $x^{m}(t)$ is continuous, $x(t)$ is. Furthermore, by (13),

(17) $x(t) \to c$ as $t \to \infty$,

since $\sum_{m=1}^{\infty} \epsilon(t)/2^{m} = \epsilon(t) \to 0$. Finally, since $x(t) = \lim x^{m}(t)$,

(18) $x(t) = c - B(t)x(t) - \int_t^{\infty} B(s)A(s)x(s)\,ds$.

In fact, (10) and the convergence of (9) imply that the limit process, $m \to \infty$, can be carried out beneath the integral sign of (12), since, in view of (14) and (16), the functions $x^{m}(t)$ are uniformly bounded on the half-line (15).

It will be shown that, if $t_0$ is large enough, the integral equation (18) implies the differential equation (7). If this is granted, it will follow that (7) has a solution satisfying (17), where $c$ is any given constant vector.


This will complete the proof of (I). 
In fact, let ${}^{i}e$ be the column vector corresponding to which the matrix $({}^{1}e, \ldots, {}^{n}e)$ becomes the unit matrix, and let ${}^{i}x(t)$ denote the solution $x(t)$ which the above construction supplies when $c = {}^{i}e$. Then $\det({}^{1}x, \ldots, {}^{n}x)$ tends to 1 as $t \to \infty$. Hence, the $n$ solution vectors ${}^{i}x(t)$ are linearly independent. Consequently, if $x(t)$ is any solution of (7), there exists a unique set of $n$ scalar constants $\gamma_i$ satisfying

$x(t) = \sum_{i=1}^{n} {}^{i}x(t)\,\gamma_i$.

It follows therefore from $\lim {}^{i}x(t) = {}^{i}e$ that $\lim x(t)$ exists for every solution $x(t)$, the vector $c$ in (17) being

$c = \sum_{i=1}^{n} {}^{i}e\,\gamma_i$.

Accordingly, (7) is of type (*), as claimed in (I).

In order to prove that (18) is just a transcription of (7), put

(19) $C(t) = E + B(t)$,

where $E$ is the unit matrix. Then the convergence of the integral (8) implies that $C(t) \to E$ as $t \to \infty$. Hence, $\det C(t) \neq 0$ on the half-line (15), if $t_0$ is large enough. Thus $C^{-1}(t)$ exists on (15).




According to (18) and (19),

(20) $C(t)x(t) = c - \int_t^{\infty} B(s)A(s)x(s)\,ds$.

Since $x(t)$, $A(t)$ and (8) are continuous functions, the vector on the right of (20) has a first derivative. It follows therefore from (20) that the same is true of the vector $C(t)x(t)$. Consequently, the same is true of the vector $x(t)$ if it is true of the matrix $C^{-1}(t)$. But $C^{-1}(t)$ has a derivative if $C(t)$ does; in fact,

$(C^{-1})' = -C^{-1}C'C^{-1}$ if $\det C \neq 0$, $C = C(t)$.
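The formula for the derivative of the inverse can be recovered by differentiating the identity $CC^{-1} = E$. The short derivation below is a standard argument added for convenience and is not part of the original text; it takes for granted that $C^{-1}$ is differentiable, which holds because its elements are rational functions of the elements of $C$ with the non-vanishing denominator $\det C$.

```latex
\begin{align*}
0 = (C\,C^{-1})' = C'\,C^{-1} + C\,(C^{-1})'
\quad\Longrightarrow\quad
(C^{-1})' = -\,C^{-1}\,C'\,C^{-1}.
\end{align*}
```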


Since (19) and (8), where $A(t)$ is continuous, imply the existence of $C'(t)$, the existence of $x'(t)$ follows.

Consequently, (18) can be differentiated. This gives

$x'(t) = -B'(t)x(t) - B(t)x'(t) + B(t)A(t)x(t)$.

But $B'(t) = -A(t)$, by (8). Hence, if the first term on the right of the last formula line is moved to the left, it is seen that $w(t) = -B(t)w(t)$, where $w(t)$ is an abbreviation for $x'(t) - A(t)x(t)$. Accordingly, $C(t)w(t) = 0$, by (19). It follows therefore from $\det C(t) \neq 0$ that $w(t) = 0$. In view of the definition of $w(t)$, this proves that $x(t)$ is a solution of (7).


THE JOHNS HOPKINS UNIVERSITY. 




ON THE CLASSICAL EXISTENCE THEOREM OF LINEAR 
DIFFERENTIAL EQUATIONS.* 


By PHILIP HARTMAN and AUREL WINTNER.


1. In the classical existence theorem of systems of linear differential equations

$dw_i/dz = \sum_{k=1}^{n} g_{ik}(z)\,w_k$, $(i = 1, \ldots, n)$,

the coefficient functions are supposed to be regular power series, convergent in a circle about $z = 0$. If $z = e^{-s}$, the system appears in the form

(1) $dw_i/ds = \sum_{k=1}^{n} g_{ik}(s)\,w_k$,

where the coefficient functions are convergent series in positive powers of $e^{-s}$, power series $a_1 e^{-s} + a_2 e^{-2s} + \cdots$ having no constant terms (and convergent for large $\Re s$). It was shown in [2] that, due to this circumstance, the classical existence theorem can be extended to the case in which the coefficient functions of (1) are absolutely convergent Laplace-Stieltjes integrals with a positive lower limit of integration,

(2) $\int_{\lambda_0}^{\infty} e^{-s\lambda}\,d\alpha(\lambda)$, $\lambda_0 > 0$,

where $s = \sigma + it$ and $\sigma > \text{const.}$


The present note deals with another extension, namely, with the case in which the coefficient functions are uniformly almost-periodic (regular) functions on a half-plane $\sigma > \text{const.}$, with Fourier expansions of the form

(3) $\sum_{m} a_m e^{-\lambda_m s}$, $\lambda_m \geq \text{const.} > 0$,

which need not have a half-plane of convergence (and still less, as in (2), a half-plane of absolute convergence). While both extensions contain the classical theorem, neither can be deduced from the other and it seems to be hard to formulate one theorem containing both extensions.


* Received November 5, 1948. 


PHILIP HARTMAN AND AUREL WINTNER. 


It may be mentioned that the analogue of existence theorems in question 
is not likely to be true in case of integrals (2) having half-planes of conver- 
gence but no half-planes of absolute convergence. 


2. Let a function $f(s)$ be called uniformly almost-periodic in the closed half-plane $\sigma \geq 0$ if, on the one hand, $f(s)$ is regular and (in the usual sense) uniformly almost-periodic on the open half-plane $\sigma > 0$, and, on the other hand, $f(s)$ goes over on the line $\sigma = 0$ into a continuous boundary function, $f(it)$, which is uniformly almost-periodic (for $-\infty < t < \infty$), while the uniform almost-periodicity of the functions $f(\sigma + it)$ of the real variable $t$ is uniform in $\sigma$ (as $\sigma \to 0$).

In this terminology, the theorem in question can be formulated as 


follows: 


THEOREM. In the half-plane $\sigma \geq 0$, let the $n^2$ coefficient functions of (1) be uniformly almost-periodic, with Fourier expansions,

(4) $f_{ik}(s) \sim \sum_{m} a^{ik}_{m} e^{-\lambda_m s}$,

in which the exponents, $\lambda_m$, have a positive lower bound. Then every component, $w_i(s)$, of every solution, $(w_1, \ldots, w_n)$, of (1) is uniformly almost-periodic in the half-plane $\sigma \geq 0$. Furthermore, the $n$ integration constants, $c_i$, specifying a solution of (1) can be chosen to be the constant terms of the Fourier expansions of $w_i(s)$, and any set of $n$ assigned mean-values, $c_i$, determines a solution $(w_1, \ldots, w_n)$ uniquely. Finally, the non-vanishing exponents occurring in the Fourier expansions of any of the $n$ components of any solution are linear combinations, with positive integral coefficients, of a finite number of the exponents $\lambda_m$ occurring in (4).

Let the coefficient matrix, $(f_{ik})$, of (1), be denoted by $F$ (and, correspondingly, the matrix of the Fourier expansions (4) by

(5) $F(s) \sim \sum_{m} A_m e^{-\lambda_m s}$,

where $(a^{ik}_{m}) = A_m$). Then (1) can be written in the form

(6) $dW/ds = F(s)W$,

if $W$ denotes a matrix, $W = (w_{ik})$, in which each of the $n$ columns represents a solution vector of (1).

If $W(s)$ is a fundamental matrix of (1), i.e., a matrix formed by linearly independent solution vectors, the most general fundamental matrix of (1) is $W(s)C$, where $C$ is any constant matrix of non-vanishing determinant. Hence, all but the last of the assertions of the Theorem will be proved if it is shown that (6) has a solution $W(s)$ which is uniformly almost-periodic in the half-plane $\sigma \geq 0$ and is such that the Fourier expansion of

(7) $W(s) - E$ ($E$ = the unit matrix)

contains no constant term. (The truth of the last assertion of the Theorem, i.e., of the assertion that every exponent of the Fourier expansion of (7) is a linear combination, with positive integral coefficients, of the exponents occurring in (5), will be clear from the proof below.)

It can be assumed that

(8) $\lambda_m \geq 1$

in (5). For, on the one hand, it is assumed that $\lambda_m \geq p$ holds for every $m$ and for some $p > 0$, and, on the other hand, the assumptions and the assertions of the Theorem remain unaltered if $s$ is replaced by $s/p$, where $p$ is any positive constant.


3. The comments made after (3) imply that the proof of the Theorem 
must involve deep function-theoretical properties of the uniformly almost- 
periodic functions of a complex variable. The properties in question will 
enter via the following Lemma: 


Lemma. On the half-plane $\sigma \geq 0$, let $f(s)$ be a uniformly almost-periodic function, with a Fourier expansion

(9) $f(s) \sim \sum_{m} a_m e^{-\mu_m s}$

satisfying

(10) $\mu_m \geq \nu$ for some $\nu > 0$.

Then $f(s)$ is bounded, say

(11) $|f(s)| \leq c$ for $\sigma \geq 0$,

and there exists on the half-plane $\sigma \geq 0$ a uniformly almost-periodic function, $f^*(s)$, having the Fourier expansion

(12) $f^*(s) \sim -\sum_{m} a_m \mu_m^{-1} e^{-\mu_m s}$
m 


and satisfying the inequality

(13) $|f^*(s)| \leq c\,e^{-\nu\sigma}/\nu$ for $\sigma \geq 0$;

furthermore,

(14) $df^*(s)/ds = f(s)$ if $\sigma > 0$.




First, the boundedness of (9) in the half-plane $\sigma \geq 0$, i.e., the existence of a $c < \infty$ satisfying (11), follows from the assumption that $f(s)$ is almost-periodic on the closed half-plane $\sigma \geq 0$ and from (9) and (10). In fact, this is contained in a theorem of Bohr (cf. [1], pp. 150-151). Next, (11), (10) and (9) imply that

(15) $|f(s)| \leq c\,e^{-\nu\sigma}$ if $\sigma \geq 0$

(cf. [1], p. 152; although the wording is slightly different there, it follows from the Phragmén-Lindelöf principle that (15) is equivalent to what is deduced loc. cit.).

Since $f(s)$ is continuous for $\sigma \geq 0$ and regular for $\sigma > 0$, it follows from (15) that it is possible to define for $\sigma \geq 0$ a function, $f^*(s)$, which is continuous for $\sigma \geq 0$ and regular for $\sigma > 0$, by placing

(16) $f^*(s) = -\int_s^{\infty} f(z)\,dz$.

If the integration path is chosen to be parallel to the real axis, then (13) follows from (15) and (16).

Since (14) is clear from (16), all that remains to be ascertained is that the function $f^*(s)$, defined by (16), is uniformly almost-periodic for $\sigma \geq 0$, with (12) as its Fourier expansion, if (9) is uniformly almost-periodic for $\sigma \geq 0$ and satisfies (10). But this can be concluded from the Bohl-Bohr theorem (cf. [1], p. 7 and p. 152). In fact, it follows from (15) and Cauchy's theorem, that the integration path in (16) can be chosen to be $J + L$, where $J$ denotes the segment joining $s$ with $\sigma$, where $s = \sigma + it$, and $L$ denotes the real half-line $\sigma \leq u < \infty$; cf. [1], p. 153.


4. Due to the function-theoretical facts contained in the Lemma, the proof of the Theorem can be carried out so as to become formally about the same as the proof given in [2] for the case in which (2) takes the place of (3). In fact, the proof can now be based, as in [2], on successive approximations, as follows:

Under the assumption that the matrix (5) is uniformly almost-periodic for $\sigma \geq 0$ and satisfies (8), put

(17) $V_0(s) = E$,

where $E$ is the unit matrix, and let the matrices $V_1(s), V_2(s), \ldots$ be defined, save for additive constants, by the recursion formula

(18) $dV_{m+1}(s)/ds = F(s)V_m(s)$.




It will be shown that the additive constants can be determined in such a way that $V_1(s), V_2(s), \ldots$ satisfy the "initial conditions"

(19) $V_m(\infty) = 0$, where $V_m(\infty) = \lim_{\sigma \to \infty} V_m(s)$ $(m = 1, 2, \ldots)$

(the 0 in (19) denotes the zero matrix), and that $V_m(s)$ then becomes uniformly almost-periodic in the half-plane $\sigma \geq 0$.

For any matrix $C = (c_{ik})$, where $i = 1, \ldots, n$ and $k = 1, \ldots, n$, let $|C|$ denote the greatest of the $n^2$ non-negative numbers $|c_{ik}|$. Then

(20) $|AB| \leq n\,|A|\,|B|$

holds for any pair of $n$-rowed matrices $A$, $B$.
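The inequality (20) for the greatest-element norm can be checked numerically. The sketch below is an illustration only; the helper names `max_entry_norm` and `matmul` and the sample matrices are assumptions introduced here, not notation from the paper.

```python
def max_entry_norm(M):
    """|C| in the sense used for (20): the greatest absolute value of an element."""
    return max(abs(entry) for row in M for entry in row)


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# A generic 2-rowed pair: the inequality holds with room to spare.
A = [[1, -2], [3, 0.5]]
B = [[2, 1], [-1, 4]]
lhs = max_entry_norm(matmul(A, B))
rhs = len(A) * max_entry_norm(A) * max_entry_norm(B)

# All-ones matrices: the factor n cannot be improved, since equality occurs.
ones = [[1] * 3 for _ in range(3)]
lhs_ones = max_entry_norm(matmul(ones, ones))
rhs_ones = 3 * max_entry_norm(ones) * max_entry_norm(ones)
```

The all-ones case shows why the factor $n$ appears in (20): each element of the product is a sum of $n$ terms, each bounded by $|A|\,|B|$.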


Since $F(s)$ is supposed to be uniformly almost-periodic for $\sigma \geq 0$, with a Fourier expansion (5) satisfying (8), it follows from the first assertion, (11), of the Lemma that $|F(s)| \leq nc$ holds for $\sigma \geq 0$ and for some $c = \text{const.}$ But (11) and (10) imply (15). Consequently, from (8),

(21) $|F(s)| \leq nc\,e^{-\sigma}$ if $\sigma \geq 0$.

It follows therefore from (17) and from the Lemma that the case $m = 0$ of (18) is satisfied by the function $V_1(s) = F^*(s)$, which is uniformly almost-periodic for $\sigma \geq 0$ (and differentiable, i.e., regular, for $\sigma > 0$); that this $V_1(s)$ satisfies the inequality

(22) $|V_1(s)| \leq nc\,e^{-\sigma}$ if $\sigma \geq 0$;

finally, that all exponents occurring in the Fourier expansion of $V_1(s)$, being the same as those occurring in (5), are subject to the inequality (8).

Suppose that, for a fixed $m \geq 1$, the function $V_m(s)$ has been defined in such a way that it is uniformly almost-periodic for $\sigma \geq 0$, and has a Fourier expansion which does not contain any exponent less than $m$. It is then clear from (5) and (8) that $F(s)V_m(s)$ is uniformly almost-periodic for $\sigma \geq 0$, with a Fourier expansion in which no exponent is less than $m + 1$. Hence, if it is assumed that the given $V_m(s)$ satisfies the inequality

(23) $|V_m(s)| \leq (nc)^{2m} e^{-m\sigma}/m!$ if $\sigma \geq 0$,

then, since (23), (21) and (20) imply that

$|F(s)V_m(s)| \leq n(nc)^{2m+1} e^{-(m+1)\sigma}/m!$ if $\sigma \geq 0$,

the Lemma supplies the existence of a $V_{m+1}(s)$ having the following properties: $V_{m+1}(s)$ is uniformly almost-periodic for $\sigma \geq 0$, satisfies (18) if $\sigma > 0$, and

$|V_{m+1}(s)| \leq n(nc)^{2m+1} e^{-(m+1)\sigma}/(m+1)!$ if $\sigma \geq 0$.

Since the inequality $|F(s)| \leq nc$, where $\sigma \geq 0$, was the only restriction on the choice of $c$, it can be assumed that $c \geq 1$. Then the last formula line implies that (23) remains true if $m$ is replaced by $m + 1$. On the other hand, (22) shows that (23) is true for $m = 1$ if $c \geq 1$. This completes the induction (in particular (19) holds, by (23), for every $m \geq 1$).


5. Put 
(24) W(s) +3 — Vn(s)} (Vo =). 


m= 
In view of (23), the series (24) is uniformly convergent in the half-plane $\sigma \geq 0$. Its sum, $W(s)$, is uniformly almost-periodic for $\sigma \geq 0$, since every term of (24) is. Furthermore, (23) and (18) show that the derived series,


{Viner’(8) — Vn’ (8) }, 


m=0 
is uniformly convergent for $\sigma \geq 0$. Hence, the function (24) is differentiable on the closed half-plane $\sigma \geq 0$ and, in view of (18), satisfies (6). Finally, since the Fourier expansion of $V_m(s)$ contains no exponent less than $m$, (24) shows that the Fourier expansion of the difference (7) has no constant term.


THE JOHNS HOPKINS UNIVERSITY.


REFERENCES. 


[1] A. S. Besicovitch, Almost Periodic Functions, Cambridge, 1932. 
[2] A. Wintner, “On the classical existence theorem of linear differential equations,” 
American Journal of Mathematics, vol. 71 (1949), pp. 331-338. 




SEPARATION THEOREMS FOR BOUNDED HERMITIAN FORMS.* 


By PHILIP HARTMAN and AUREL WINTNER.


1. If $2a_1 \leq 2a_2 \leq \cdots \leq 2a_n$ are the lengths of the axes of an $n$-dimensional ellipsoid, and $2b_1 \leq 2b_2 \leq \cdots \leq 2b_{n-1}$ those of its projection on a hyperplane through the center, then $a_1 \leq b_1 \leq a_2 \leq \cdots \leq b_{n-1} \leq a_n$. A corresponding theorem holds for the $(n-1)$-dimensional projections of a quadric belonging to a quadratic form, say $Q_n(x_1, x_2, \ldots, x_n)$, which is indefinite or semi-definite. This is just a restatement of a separation theorem of Sturm, according to which the characteristic numbers, say $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ and $\mu_1 \leq \mu_2 \leq \cdots \leq \mu_{n-1}$, of $Q_n$ and $Q_n(0, x_2, \ldots, x_n)$ satisfy the inequalities

(1) $\lambda_1 \leq \mu_1 \leq \lambda_2 \leq \cdots \leq \mu_{n-1} \leq \lambda_n$

(cf., e.g., [1], pp. 210-217 or [4], pp. 36-40). In fact, there is no loss of generality in assuming that the coordinate system $(x_1, \ldots, x_n)$ is so chosen as to make the equations of the $n$-dimensional quadric and of the hyperplane $Q_n(x_1, x_2, \ldots, x_n) = \text{const.}$ and $x_1 = 0$ respectively (even though $Q_n$ then is not the given $Q_n$).
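The separation inequalities (1) can be observed on a small example. The sketch below is an illustration only: the $3 \times 3$ matrix `Q3`, the helper names, and the use of the closed-form trigonometric expression for the characteristic numbers of a real symmetric $3 \times 3$ matrix are choices made here, not part of the paper. Setting $x_1 = 0$ corresponds to deleting the first row and column.

```python
import math


def eig_sym2(a, b, d):
    """Characteristic numbers (ascending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean - radius, mean + radius]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def eig_sym3(A):
    """Characteristic numbers (ascending) of a real symmetric 3x3 matrix."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if p1 == 0.0:                      # already diagonal
        return sorted(A[i][i] for i in range(3))
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(A[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(B) / 2.0))   # clamp against rounding
    phi = math.acos(r) / 3.0
    hi = q + 2.0 * p * math.cos(phi)
    lo = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [lo, 3.0 * q - hi - lo, hi]


def interlaces(lam, mu, tol=1e-9):
    """True if lam[0] <= mu[0] <= lam[1] <= ... <= mu[-1] <= lam[-1]."""
    return all(lam[i] - tol <= mu[i] <= lam[i + 1] + tol for i in range(len(mu)))


Q3 = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
lam = eig_sym3(Q3)                               # form in x1, x2, x3
mu = eig_sym2(Q3[1][1], Q3[1][2], Q3[2][2])      # form with x1 = 0
separated = interlaces(lam, mu)
```

For this sample matrix the characteristic numbers are $3 - \sqrt{3}$, $3$, $3 + \sqrt{3}$, and those of the form with $x_1 = 0$ are $(7 \mp \sqrt{5})/2$, so the chain (1) can also be confirmed by hand.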

The classical proofs of (1) depend on the consideration of Sturmian chains, constructed in terms of polynomials attached to the secular determinants. This apparatus is not available if $Q_n = Q_n(x_1, \ldots, x_n)$ is replaced by a quadratic form in an infinity of variables, $Q = Q(x_1, x_2, \ldots)$; not even if the latter is assumed to be completely continuous (in Hilbert's sense) but is otherwise arbitrary. On the other hand, it is known that, if $Q(x_1, x_2, \ldots)$ is completely continuous, then its spectrum consists of 0 and of an infinite sequence of eigenvalues which tend to 0; that 0 may or may not be an eigenvalue (though it is always in the spectrum); finally, that every non-vanishing eigenvalue is of finite multiplicity (cf. [3], p. 148). In addition, Hilbert has proved that, if $Q_n = Q_n(x_1, x_2, \ldots, x_n)$ denotes the $n$-th section, $Q(x_1, \ldots, x_n, 0, 0, \ldots)$, of a completely continuous $Q = Q(x_1, x_2, \ldots)$, then, as $n \to \infty$, the spectrum of $Q_n$ tends to the spectrum of $Q$ (even if the multiplicities of the eigenvalues of $Q$ are counted); cf. [3], pp. 156-174. Finally, it is clear that the complete continuity of $Q(x_1, x_2, \ldots)$ assures that of $Q(0, x_2, x_3, \ldots)$. These facts imply that, if $\lambda'$, $\lambda''$, $\lambda^*$ and $\mu'$, $\mu''$, $\mu^*$ denote points in the spectrum of $Q(x_1, x_2, \ldots)$ and $Q(0, x_2, x_3, \ldots)$, respectively, then there belong to every pair $\lambda'$, $\lambda''$ satisfying $\lambda' \leq \lambda''$ some $\mu^*$ satisfying $\lambda' \leq \mu^* \leq \lambda''$, and to every pair $\mu'$, $\mu''$ satisfying $\mu' \leq \mu''$ some $\lambda^*$ satisfying $\mu' \leq \lambda^* \leq \mu''$. In fact, this extension of (1) to completely continuous forms results from (1) itself, if (1) is applied to the $n$-th section of $Q(x_1, x_2, \ldots)$, the $m$-th section of $Q(0, x_2, x_3, \ldots)$, and then $n \to \infty$, $m \to \infty$.

* Received January 14, 1949.

For a direct verification of this extension of (1), cf. a deduction given 
by Weyl [6], p. 16%. His procedure fails if Q, instead of being completely 
continuous, is just bounded in Hilbert’s sense. The failure is due not 
merely to the possibility of a continuous spectrum, since the method can 
fail when Q is orthogonally equivalent to a diagonal form. Actually, Weyl’s 
method is readily seen to fail whenever the spectrum of Q has at least two 
cluster points (in the completely continuous case, 0 is the only such cluster 
point). 

Although the above procedure, consisting of (1) and of a limit process, applies in certain cases in which Weyl's method fails, it does not apply to every bounded $Q$. The trouble is that the spectra of the sections $Q_n$ of a bounded $Q$ are capable of clustering at a value $\lambda$ which is not in the spectrum of $Q$. This is shown by the example


(Toeplitz). In fact, the matrix of the bounded quadratic form (2) is an orthogonal matrix, and so the spectrum of (2), being situated both on the real axis and on the unit circle, cannot contain points distinct from $\lambda = \pm 1$ (incidentally, both of these points are eigenvalues of infinite multiplicity). On the other hand, the discriminant of the $n$-th section of (2) vanishes when $n = 1, 3, 5, \ldots$. Hence, $\lambda = 0$ is a cluster value of the spectra of the sections of (2), although $\lambda = 0$ is not in the spectrum of (2).
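The phenomenon can be reproduced on finite sections. Since the displayed formula (2) is not legible in this copy, the sketch below assumes a form with the stated properties, namely $2(x_1x_2 + x_3x_4 + \cdots)$, whose matrix is block-diagonal with $2 \times 2$ blocks $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; this choice and the function names are assumptions made for illustration, not a quotation of the paper's example.

```python
def section(n):
    """n-th section of the assumed block-diagonal matrix with blocks [[0, 1], [1, 0]]."""
    J = [[0] * n for _ in range(n)]
    for i in range(0, n - 1, 2):
        J[i][i + 1] = 1
        J[i + 1][i] = 1
    return J


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def has_zero_row(M):
    """A zero row forces the determinant (discriminant) to vanish."""
    return any(all(entry == 0 for entry in row) for row in M)


# Even sections are symmetric and satisfy J*J = I, so they are orthogonal
# and their characteristic numbers can only be +1 or -1.
even_sections_orthogonal = all(
    matmul(section(n), section(n)) == identity(n) for n in (2, 4, 6))

# Odd sections cut the last block in half and leave a zero row,
# so 0 is a characteristic number of every odd section.
odd_sections_singular = all(has_zero_row(section(n)) for n in (1, 3, 5))
```

Under this assumption the value 0 belongs to the spectrum of every odd section, while the full matrix, being orthogonal and symmetric, has no spectrum outside $\pm 1$.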

Accordingly, an extension of (1) to arbitrary bounded forms cannot 
be obtained from (1) itself. It will, however, be shown that such an exten- 
sion exists. Needless to say, the method to be applied will have to be distinct 
from the determinantal approach, based on Sturmian chains, as well as from 
Weyl’s approach, referred to above. 


2. Let $H$ be a bounded Hermitian matrix, $x$ a point of the complex Hilbert space, and $e$ an $x$ of length 1. Project the Hermitian form $x^*Hx$ on the hyperplane $x^*e = 0$ and denote by ${}^{\circ}H = {}^{\circ}H(e)$ the matrix of an Hermitian form which thus results on $x^*e = 0$. For instance, if $H = (h_{nm})$, where $n, m = 1, 2, \ldots$, and if, without loss of generality, the hyperplane $x^*e = 0$ is chosen to be a coordinate hyperplane, $x_1 = 0$, then ${}^{\circ}H = (h_{n+1\,m+1})$, where $n, m = 1, 2, \ldots$.

The theorem announced can now be formulated as follows: 

(*) Let ${}^{\circ}H = {}^{\circ}H(e)$ denote any of the above-defined projections of a bounded Hermitian matrix $H$. Let $\lambda'$ and $\lambda''$, where $\lambda' < \lambda''$, be a pair of points contained in the spectrum of one of the two matrices $H$, ${}^{\circ}H$. Then the spectrum of the other matrix has at least one point on the closed interval $[\lambda', \lambda'']$.

Both of these assertions remain true in the limiting case, $\lambda' = \lambda''$, (where the interval $[\lambda', \lambda'']$ becomes the point $\lambda'$) if $\lambda'$ is either a cluster point of the spectrum, or a multiple point of the point spectrum, of $H$, ${}^{\circ}H$, respectively.

The (bounded, Hermitian) matrix ${}^{\circ}H = {}^{\circ}H(e)$, operating on the given linear subspace,

(3) $x^*e = 0$,

of Hilbert's space, can explicitly be represented as

(4) ${}^{\circ}H = (I - E)H(I - E)$,

where $I$ is the unit matrix and $E = E(e)$ denotes the matrix defined by the linear substitution

(5) $Ex = (e^*x)e$, $(|e| = 1)$.

In fact, the latter represents the operation of projecting the $x$-space on the normal of the hyperplane (3).

Since every bounded linear form is completely continuous, it is seen from (5) that the matrix $E$ is completely continuous and so, in particular, bounded. It is also seen from (5) that the matrix $E$ is Hermitian, $E^* = E$. This was used in (4), where $H^* = H$, $I^* = I$. Finally, since repeated application of (5) gives $E[Ex] = (e^*[e^*x]e)e$, which, in view of $e^*e = 1$, is equivalent to $E[Ex] = [e^*x]e$, it follows that $E^2 = E$ is an identity. This merely verifies the fact that the Hermitian matrix $E$, representing a projection, must be idempotent, $E^2 = E$.


3. The formal basis of the proof of (*) will be the fact that

(6) $x^*({}^{\circ}H)^2 x = x^*H^2 x - |x^*He|^2$

is an identity on the hyperplane (3). This identity can be verified as follows:

If $x$ is on the hyperplane (3), it is seen from (5) that $Ex = 0$. In view of (4), this implies that ${}^{\circ}Hx = (I - E)Hx$. It follows therefore from (4) that ${}^{\circ}H({}^{\circ}Hx) = (I - E)H(I - E)^2Hx$. But two applications of $E^2 = E$ reduce this relation to ${}^{\circ}H({}^{\circ}Hx) = (H - EH)^2x$. Since $H^* = H$ and $E^* = E$, it follows that the Hermitian form on the left of (6) is identical with $x^*(H - HE)^2x$. This, in turn, is identical with $x^*(H^2 - HEH)x$, since $Ex = 0$. Accordingly, (6) is equivalent to $x^*HEHx = |x^*He|^2$. But $E^2 = E$ and $(EH)^* = HE$, hence $x^*HEHx = |EHx|^2$. On the other hand, the definition, (5), of $E$ shows that $|EHx| = |e^*Hx|$. Consequently, (6) is equivalent to $|x^*He| = |e^*Hx|$.

This completes the proof of (6), since, $H$ being Hermitian, $x^*He = (e^*Hx)^*$.
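The identity (6) lends itself to a direct numerical check in a finite-dimensional model. The following sketch works in three complex dimensions instead of Hilbert's space; the sample matrix `H`, the unit vector `e`, the vector `x` and all helper names are assumptions chosen for illustration.

```python
import math


def matvec(M, v):
    """Matrix times column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(u, v):
    """The product u* v (conjugate-linear in the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply_E(e, v):
    """Ev = (e* v) e, the substitution (5)."""
    c = inner(e, v)
    return [c * ei for ei in e]


def apply_I_minus_E(e, v):
    """(I - E)v."""
    return [vi - wi for vi, wi in zip(v, apply_E(e, v))]


def apply_projected_H(H, e, v):
    """(I - E) H (I - E) v, the matrix (4) applied to v."""
    return apply_I_minus_E(e, matvec(H, apply_I_minus_E(e, v)))


# A Hermitian sample matrix, a unit vector e, and a vector x with e* x = 0.
H = [[2, 1 - 1j, 0], [1 + 1j, 3, 1j], [0, -1j, 1]]
e = [1 / math.sqrt(2), 1j / math.sqrt(2), 0]
x = [1j, 1, 2]

on_hyperplane = abs(inner(e, x))                       # should vanish, cf. (3)

# E is idempotent: E(Ev) = Ev for an arbitrary v.
v = [1, 2 - 1j, 3j]
Ev = apply_E(e, v)
idempotency_gap = max(abs(a - b) for a, b in zip(apply_E(e, Ev), Ev))

# Left and right sides of (6).
Hx = matvec(H, x)
lhs = inner(x, apply_projected_H(H, e, apply_projected_H(H, e, x))).real
rhs = inner(Hx, Hx).real - abs(inner(x, matvec(H, e))) ** 2
```

Here `rhs` uses $x^*H^2x = |Hx|^2$, which holds because $H$ is Hermitian; the two sides agree up to rounding for any $x$ with $e^*x = 0$.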


4. In the proof of (*), it can be assumed that the mid-point of the interval $[\lambda', \lambda'']$ (or, if $\lambda' = \lambda''$, the point $\lambda'$) is $\lambda = 0$, i.e., that $\lambda'' = -\lambda' \geq 0$. In fact, this normalization, being just a shift of the origin of the $\lambda$-axis, can be effected by adding to $H$ a scalar multiple of the unit matrix; a modification which, on the subspace (3), is paralleled by the adding to ${}^{\circ}H$ of the same scalar multiple of the respective unit matrix.

It follows that, according as $[\lambda', \lambda'']$ is a point or an interval, it can be assumed that $\lambda' = 0 = \lambda''$ or $\lambda' = -1$, $\lambda'' = 1$. In fact, the latter normalization results, if $\lambda'' = -\lambda' > 0$, by a change of the unit of length on the $\lambda$-axis.

Proof of (*) for a given $H$; $\lambda' \neq \lambda''$. In this case, the assumption is that $\lambda = -1$ and $\lambda = 1$ are in the spectrum of $H$. The claim is that the interval $-1 \leq \lambda \leq 1$ cannot be free of the spectrum of ${}^{\circ}H$.

Suppose, for a moment, that the points $\lambda = \pm 1$ of the spectrum of $H$ are in the point spectrum of $H$. Then there exist two perpendicular unit vectors, say $x_1$ and $x_2$, satisfying $Hx_1 = -x_1$ and $Hx_2 = x_2$. These relations clearly imply that, if $x_0$ is any unit vector which is a linear combination of $x_1$ and $x_2$, then $x_0^*H^2x_0 = 1$. Since the hyperplane (3) and the hyperplane spanned out by $x_1$ and $x_2$ intersect along a linear manifold (of dimensionality 1 or 2), it is possible to choose the above $x = x_0$ so as to satisfy (3). Then, since (3) implies (6),

(7) $x_0^*({}^{\circ}H)^2x_0 = 1 - |x_0^*He|^2$ $(|x_0| = 1)$.


The existence of such an $x_0$ was obtained under the assumption that $\lambda = -1$ and $\lambda = 1$ are in the point spectrum of $H$. If they are just cluster values of the point spectrum of $H$, it is clear from the proof of (7) that the assertion of (7) becomes true if it is modified as follows: Corresponding to every $\epsilon > 0$, there exists a unit vector $x_0 = x_\epsilon$ satisfying

(8) $\bigl| x_0^*({}^{\circ}H)^2x_0 - 1 + |x_0^*He|^2 \bigr| < \epsilon$ $(|x_0| = 1)$.
On the other hand, if $\lambda = -1$ and $\lambda = 1$ are in the continuous spectrum of $H$, then, instead of the (weak) clustering of eigensolutions proper, an application of Hellinger's eigendifferentials (cf. [2], pp. 240-242) leads to the existence of an $x_0 = x_\epsilon$ satisfying (8), where $\epsilon > 0$ is arbitrary. Finally, it is clear that the same holds if $\lambda = -1$ and $\lambda = 1$ are not in one and the same of the three possible components (viz., point spectrum, cluster set of the point spectrum, continuous spectrum) of the spectrum of $H$.

Since $-|x_0^*He|^2 \leq 0$, it follows, by letting $\epsilon \to 0$ in (8), that the greatest lower bound of the Hermitian form $x^*({}^{\circ}H)^2x$ on the sphere $|x| = 1$ cannot exceed 1. On the other hand, this greatest lower bound cannot be negative, since the form is identical with $|{}^{\circ}Hx|^2$. But this greatest lower bound always is in the spectrum of the form (cf. [7], p. 147). Consequently, some point of the interval $0 \leq \lambda \leq 1$ must be in the spectrum of $({}^{\circ}H)^2$. In view of Toeplitz's criterion for the existence or non-existence of a bounded reciprocal (cf. [7], p. 138), this implies that some point of the interval $-1 \leq \lambda \leq 1$ must be in the spectrum of ${}^{\circ}H$.


This proves the first assertion of (*). 


Proof of (*) for a given ${}^{\circ}H$; $\lambda' \neq \lambda''$. The assumption now is that $\lambda = -1$ and $\lambda = 1$ are in the spectrum of ${}^{\circ}H$. The corresponding assertion of (*) will be proved if it is shown that, on the sphere $|x| = 1$, the greatest lower bound of the non-negative definite form $x^*H^2x$ cannot be greater than 1. For, if this is proved, it follows, as above, that some point of the interval $0 \leq \lambda \leq 1$ is in the spectrum of $H^2$, which, in view of Toeplitz's criterion, implies that some point of the interval $-1 \leq \lambda \leq 1$ is in the spectrum of $H$.

Accordingly, it is sufficient to prove that, if $\lambda = -1$ and $\lambda = 1$ are in the spectrum of ${}^{\circ}H$, then, corresponding to every $\epsilon > 0$, some unit vector $x_0 = x_\epsilon$ must satisfy the inequality $x_0^*H^2x_0 < 1 + \epsilon$. Actually, it will be sufficient to show that, if $\lambda = -1$ and $\lambda = 1$ are in the point spectrum of ${}^{\circ}H$, then some unit vector $x_0$ must satisfy the equality $x_0^*H^2x_0 = 1$. For, if this is proved, it will be clear that the transition to the general case, in which both points $\lambda = \pm 1$ are in the spectrum (but not necessarily in the point spectrum) of ${}^{\circ}H$, requires exactly the same steps as the transition from (7) to (8) above.




Let $\lambda = -1$ and $\lambda = 1$ be in the point spectrum of ${}^{\circ}H$. Then there exists a pair of perpendicular unit vectors, say $x_1$ and $x_2$, satisfying ${}^{\circ}Hx_1 = -x_1$ and ${}^{\circ}Hx_2 = x_2$. Clearly,

$x_0^*({}^{\circ}H)^2x_0 = 1$ $(|x_0| = 1)$

holds for any unit vector, $x_0$, contained in the linear manifold spanned out by $x_1$ and $x_2$. Such an $x_0$ can be found in any given hyperplane of the $x$-space. Let this hyperplane be given as

$x^*He = 0$,

where $e$ is the unit vector to which ${}^{\circ}H = {}^{\circ}H(e)$ belongs.

Since $x_1$ and $x_2$ are eigenvectors of ${}^{\circ}H$, and since ${}^{\circ}H$, being the result of projecting $H$, operates only on vectors contained in the hyperplane (3), it is understood that any linear combination of $x_1$ and $x_2$, and therefore the above $x_0$, must satisfy (3). But (3) implies (6). Hence, (6) is satisfied by $x = x_0$. It follows therefore from the last two formula lines that $1 = x_0^*H^2x_0$, where $|x_0| = 1$. This is what had to be proved (for some $x_0$).


Proof of (*) in the case $\lambda' = \lambda''$. Let $K$ denote either of the matrices $H$, ${}^{\circ}H$, and $L$ the other matrix. Then both assertions made by (*) for the limiting case, $\lambda' = \lambda''$, can be formulated as follows: The point $\lambda = 0$ must be in the spectrum of $K$ if it is either a cluster point of the spectrum of $L$ or a multiple eigenvalue of $L$. In view of what has been proved above for the case of an arbitrary closed interval $[\lambda', \lambda'']$, where $\lambda' < \lambda''$, only the second of the contingencies, that of a multiple eigenvalue, need be considered; simply because an $[\lambda', \lambda'']$, with either $\lambda' = 0$, $\lambda'' > 0$ or $\lambda' < 0$, $\lambda'' = 0$, can always be found if $\lambda = 0$ is a cluster point.

Accordingly, it is sufficient to show that $Kx = 0$ must have a solution $x = x_0$, where $|x_0| = 1$, if $Lx = 0$ has at least two linearly independent solutions, say $x = x_1$ and $x = x_2$. The latter can be assumed to be perpendicular unit vectors. But if they are used in the same way as the two eigenvectors $x_1$, $x_2$ were used in the above pair of proofs in the case $\lambda' \neq \lambda''$, it is clear that (6) leads, via (3), to a unit vector $x_0$ satisfying $x_0^*({}^{\circ}H)^2x_0 = 0$ or $x_0^*H^2x_0 = 0$, according as $L = H$ or $L = {}^{\circ}H$. In other words, there exists, in either case, a unit vector $x_0$ satisfying $x_0^*K^2x_0 = 0$. Since $x_0^*K^2x_0 = |Kx_0|^2$, this proves that $Kx_0 = 0$.


5. Sturm's separation theorem, (1), can be restated as follows: If $\lambda_1^n \leq \lambda_2^n \leq \cdots \leq \lambda_n^n$ are the roots of the secular equation of the $n$-th section, $H^{(n)} = \{(h_{ik});\ 1 \leq i, k \leq n\}$, of an infinite Hermitian matrix, $H = \{(h_{ik});\ 1 \leq i, k < \infty\}$, then

(9) $\lambda_1^n \leq \lambda_1^{n-1} \leq \lambda_2^n \leq \cdots \leq \lambda_{n-1}^{n-1} \leq \lambda_n^n$

$(n = 2, 3, \ldots)$. Since $\lambda_1^n$ is the minimum, and $\lambda_n^n$ the maximum, of the Hermitian form of $H^{(n)}$ on the unit sphere, it is clear from Hilbert's definition of the boundedness of $H$ that $H$ is bounded if and only if the values occurring in the infinite triangular matrix $\{\lambda_m^n;\ 1 \leq m \leq n < \infty\}$ are contained in a bounded interval. It follows therefore from (9), and from the convergence of every bounded monotone sequence, that if $m = 1, 2, \ldots$, then

(10) $\lambda'_m = \lim_{n \to \infty} \lambda_m^n$

and

(10 bis) $\lambda''_m = \lim_{n \to \infty} \lambda_{n-m+1}^n$

exist. Furthermore, by (9), (10) and (10 bis),

(11) $\lambda'_1 \leq \lambda'_2 \leq \cdots \leq \lambda''_2 \leq \lambda''_1$.

Finally, (11) shows that

(12) $\lim_{m \to \infty} \lambda'_m$

and

(12 bis) $\lim_{m \to \infty} \lambda''_m$

exist, and that the value of (12) does not exceed that of (12 bis).


If $s$ is the spectrum, $c$ the continuous spectrum, and $p$ the point spectrum, of a bounded $H$, let $s_0$ denote the set of those points of $s$ which are contained in one, at least, of the following three sets: $c$, the derivative of $p$, and those points of $p$ corresponding to which $H$ has an infinity of linearly independent characteristic vectors. In other words, the set $s_0$, the so-called essential spectrum of $H$, is the derivative of $s$, with the proviso that every proper eigenvalue of infinite multiplicity is counted as a cluster point of $s$. Clearly, the essential spectrum of $H$ is bounded and not vacuous. It is also clear that $H$ is completely continuous if and only if its essential spectrum consists of the single point $\lambda = 0$.

The sectional spectra (A,",* + *,An”) are not of course unitary invariants 
of H. It is this circumstance which makes possible a situation of the type 




872 PHILIP HARTMAN AND AUREL WINTNER. 


described after (2). Situations of that type are, however, strongly limited 
by the following theorem: 


(**) Let $\lambda'$ denote the least, and $\lambda''$ the greatest, value occurring in the
essential spectrum of a bounded Hermitian matrix H, and let $(\lambda_1^n, \ldots, \lambda_n^n)$
be the spectrum of the n-th section, $H^{(n)}$, of H. Then the limit (12), defined
by (10), is identical with $\lambda'$.


Furthermore, if no eigenvalue of H is less than $\lambda'$, then each of the limits
$\lambda'_m$ equals $\lambda'$. If H has an infinite sequence of (possibly multiple) eigenvalues
which are less than $\lambda'$, then this sequence is identical with the sequence
$\lambda'_1 \le \lambda'_2 \le \cdots$. Finally, if H has a finite number, say l, of (not necessarily
distinct) eigenvalues which are less than $\lambda'$, then these eigenvalues are
represented by the values $\lambda'_1 \le \cdots \le \lambda'_l$, whilst $\lambda'_{l+1}, \lambda'_{l+2}, \ldots$ become equal to $\lambda'$.


Corresponding statements hold for (10 bis), (12 bis) and $\lambda''$.


For the case of a completely continuous H, the assertions of (**) were 
proved by Schur [5], p. 297. He has also shown (loc. cit., p. 291) that,
if H is just bounded, then (10), (10 bis), hence (12), (12 bis), are unitary
invariants of H. The assertions of (**) determine the spectral meaning of 
these unitary invariants. Needless to say, (**) contains the unitary invariance 
of (10), (12). 

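In the proof of (**), use will be made of the set, say s*, of the cluster
values of the sectional spectra, $(\lambda_1^n, \ldots, \lambda_n^n)$. In other words, s* will
denote the set of those $\lambda$-values corresponding to which there exist two
sequences of integers, $m_1, m_2, \ldots$ and $n_1, n_2, \ldots$, satisfying


(13) $\lambda_m^n \to \lambda$ as $k \to \infty$, where $n = n_k \to \infty$; $m = m_k$.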


Clearly, s* is a closed set. It is known that the spectrum of H is in s* 
(cf. [7], p. 218), and that the least and the greatest values of s* are con- 
tained in the spectrum of H (ibid., p. 147). On the other hand, the example 
(2) shows that not every point of s* need be in the spectrum of H. 


6. Consider first the case in which, on the one hand, the least value, $\lambda'$,
contained in the essential spectrum, $s_0$, is the least value contained in the
spectrum, s, and, on the other hand, $\lambda'$ is not in the point spectrum. The
first of these assumptions means that l = 0; cf. the wording of (**). In view
of the second assumption, this implies that the interval $\lambda' < \lambda < \lambda' + \epsilon$
contains points of s whenever $\epsilon > 0$. Furthermore, $\lambda'$ is the least value contained
in s*. Hence it is clear from (10) that $\lambda'_1 = \lambda'$.




It will be shown that $\lambda'_m = \lambda'$ holds for every m. In view of (11), this
will prove the assertions of (**) under the present pair of assumptions.

Suppose, if possible, that $\lambda'_1 \ne \lambda'_m$, hence $\lambda'_1 < \lambda'_m$, holds for some m.
Then s* has at most m - 2 points, hence just a finite number of points, on
the interval $\lambda'_1 < \lambda < \lambda'_m$. Consequently, this interval cannot contain an
infinity of points of s. Hence, $\lambda' = \lambda'_1$ is not a cluster point of s. Since it
is assumed that $\lambda'$ is not in p, it follows that $\lambda'$ is not in s. But this
contradicts the definition of $\lambda'$.


7. Without the particular assumptions made at the beginning of § 6,
the proof of (**) proceeds as follows:

Let $\lambda_1 \le \lambda_2 \le \cdots$ denote the finite or infinite (possibly vacuous)
sequence of those eigenvalues of H which are not greater than $\lambda'$, the least
value contained in $s_0$. Then $\lambda_1$ is the least value contained in s*, and so
$\lambda_1 = \lambda'_1$.

Suppose, therefore, that $\lambda_1 = \lambda'_1, \ldots, \lambda_m = \lambda'_m$ has been proved for a
fixed m. Suppose further that, in the sequence $\lambda_1 \le \lambda_2 \le \cdots$, there occurs an
(m + 1)-st eigenvalue, $\lambda_{m+1}$. It will be shown that


(14) $\lambda_{m+1} = \lambda'_{m+1}$.

If $u = (u_1, u_2, \ldots)$ is a point in Hilbert's space, let ${}^n u$ denote either
the point $(u_1, \ldots, u_n, 0, 0, \ldots)$ of the same space or the point $(u_1, \ldots, u_n)$
of the n-dimensional space. This two-fold meaning of ${}^n u$ will not
lead to a confusion below. For the sake of convenience, let linear forms,
written above as matrix products, be now written as scalar products, (u, v).


8. Under the assumptions made before (14), let $e_1, \ldots, e_m, e_{m+1}$
denote mutually perpendicular unit vectors which are eigenvectors belonging
to $\lambda_1, \ldots, \lambda_m, \lambda_{m+1}$, respectively. For every fixed n > m, consider the
problem of minimizing the n-th section, ${}^n x^* H\, {}^n x = (H\, {}^n x, {}^n x)$, of $x^* H x = (Hx, x)$
under the 1 + m conditions


$|{}^n x| = 1$ and $({}^n e_j, {}^n x) = 0$, where j = 1, ..., m; n > m.


According to the so-called Rayleigh principle, which simply follows by a
repeated application of (9), this minimum is not greater than $\lambda_{m+1}^n$. On the
other hand, since $(H\, {}^n x, {}^n x) = (Hx, x)$ for $x = {}^n x$ and $(e_j, {}^n x) = 0$, where j = 1, ..., m,
the definition of $\lambda_{m+1}$ shows that the minimum in question is not less than
$\lambda_{m+1}$. Consequently, $\lambda_{m+1} \le \lambda_{m+1}^n$.

Since the last inequality and (10) imply that $\lambda_{m+1} \le \lambda'_{m+1}$, the proof of
(14) will be complete if it is shown that $\lambda_{m+1} \ge \lambda'_{m+1}$.




9. To this end, let $f_1^n, \ldots, f_n^n$ be a (complete) orthonormal set of
eigenvectors of the n-th section, $H^{(n)}$, of H (i.e., let


$H^{(n)} f_i^n = \lambda_i^n f_i^n$, $(f_i^n, f_k^n) = e_{ik}$,


where $(e_{ik})$ is the n-rowed unit matrix), and let n be greater than the m
occurring in the given $\lambda_{m+1}$. If $e_1, \ldots, e_{m+1}$ are defined as at the beginning
of § 8, choose m + 1 scalars $a_1^n, \ldots, a_{m+1}^n$ in such a way that the vector

$g_n = \sum_{k=1}^{m+1} a_k^n e_k$

satisfies the 1 + m conditions

$|g_n| = 1$, $(g_n, f_i^n) = 0$, where i = 1, ..., m.


It is readily seen from this definition of the points $x = g_{m+1}, g_{m+2}, \ldots$
of Hilbert's space that they must strongly cluster at some point, say at
x = d (in fact, only a finite number, m + 1, of dimensions need be considered).
In other words, if a suitable subsequence of $g_{m+1}, g_{m+2}, \ldots$ is denoted simply
by $g_{m+1}, g_{m+2}, \ldots$, then $|g_n - d| \to 0$ as $n \to \infty$. On the other hand, by
the definition of the frontal superscript, $|{}^n d - d| \to 0$. Since


$({}^n d, f_i^n) = ({}^n d - d, f_i^n) + (d - g_n, f_i^n) + (g_n, f_i^n)$,

it follows that


$({}^n d, f_i^n) \to 0$ as $n \to \infty$, where i = 1, ..., m.


Since $\lambda_{m+1}^n$ is the (m + 1)-st eigenvalue of $H^{(n)}$ and since $(H\, {}^n x, {}^n x)
\to (Hx, x)$ as $n \to \infty$ whenever $|x| < \infty$, it follows that


$(Hd, d) \ge \liminf_{n \to \infty} (\lambda_{m+1}^n\, |{}^n d|^2)$.


But $\lim |{}^n d| = |d|$. Furthermore, $|d| = 1$, since $|g_n| = 1$ and
$\lim |g_n - d| = 0$. It follows therefore from (10), and from the last
formula line, that $(Hd, d) \ge \lambda'_{m+1}$.

Finally, it is clear from the definitions of $\lambda_{m+1}$, $g_n$ and d that
$(Hd, d) \le \lambda_{m+1}$. It follows therefore from the preceding inequality that
$\lambda'_{m+1} \le \lambda_{m+1}$. Since $\lambda_{m+1} \le \lambda'_{m+1}$ was proved in § 8, the proof of (14) is
now complete.

Consequently, if the sequence $\lambda_1 \le \lambda_2 \le \cdots$ of eigenvalues not
exceeding $\lambda'$ is infinite, the assertions (**) follow. On the other hand, if this
sequence contains only a finite number, l, of eigenvalues, the proof of (**)
can be concluded by an application of the arguments used in § 6.




APPENDIX. 


Let s be the spectrum of a bounded Hermitian matrix H, and
$s_n = (\lambda_1^n, \ldots, \lambda_n^n)$ the spectrum of the n-th section, $H^{(n)}$, of H, finally s*
the set defined near the end of § 5. The fact that s is contained in s* can
be strengthened as follows:


If $\lambda$ is in s, then there exists in every $s_n$ a point $\lambda_m^n$, where $m = m_n$,
which satisfies


(15) $\lambda_m^n \to \lambda$ as $n \to \infty$, where $1 \le m = m_n \le n$;

(cf. [7], p. 218). An equivalent formulation is as follows:


If there exist a number $\epsilon > 0$ and a sequence of integers $n_1 < n_2 < \cdots$
with the property that the interval $(\lambda - \epsilon, \lambda + \epsilon)$ contains no points of $s_n$,
where $n = n_k$ and k = 1, 2, ..., then $\lambda$ is not in s.


On the other hand, as pointed out in § 5, it is not true in general that,
if there exist two sequences of integers $m_1, m_2, \ldots$ and $n_1, n_2, \ldots$, where
$1 \le m_k \le n_k$ and $n_k \to \infty$, and a corresponding point $\lambda_m^n$ of $s_n$
which satisfy (13), then $\lambda$ is in s. This is illustrated by the H of the
quadratic form (2). For, in this case, $\lambda_m^n = 0$ when $m = m_k = k + 1$ and
$n = n_k = 2k + 1$, but $\lambda = \lim \lambda_m^n = 0$ is not in the spectrum of (2).

The idea leading to the example (2) can be refined as follows:


(i) In the case of a bounded Hermitian matrix, a point $\lambda$ need not be
in s even if it is in every $s_n$; in particular, it is possible that $\lambda$ is not in s
even if there exists in every $s_n$ a $\lambda_m^n$ satisfying (15).


This possibility can be further refined:


(i bis) In the case of a bounded Hermitian matrix, a point $\lambda$ need
not be in s even if there exist $k = k_n$ points $\lambda_{h+1}^n < \cdots < \lambda_{h+k}^n$
in $s_n$ which satisfy


(17) $\lambda_{h+1}^n \to \lambda$ and $\lambda_{h+k}^n \to \lambda$ as $n \to \infty$,


where $0 \le h = h_n \le n - k$ and $k = k_n \to \infty$.


There arises the question whether the possibility admitted in (i), (i bis)
can be ruled out for certain types of bounded Hermitian matrices. It is
trivial that these possibilities cannot occur for diagonal matrices. It is
natural to inquire whether or not they can occur in the case of Jacobi



matrices, which, according to Hellinger and Toeplitz, are the normal forms,
under unitary transformations, of matrices with simple spectra. These
matrices are defined by the conditions

(18) $h_{mn} = 0$ if $|m - n| > 1$, $h_{n,n+1} \ne 0$,

where n, m = 1, 2, .... Actually, the matrix of (2), which is not of Jacobi
type, can be adjusted to the case of a Jacobi matrix.


(ii) In the case of a bounded Jacobi matrix, a point $\lambda$ need not be
in s if there exists a subsequence of the sets $s_1, s_2, \ldots$ such that the k-th
element of the subsequence contains a point $\lambda_m^n$ which satisfies (13), where
$m = m_k$ and $n = n_k$.


On the other hand, the possibility claimed by (i bis) cannot occur for
the case of a Jacobi matrix. More than this exclusion is contained in the
following theorem:


(iii) In the case of a bounded Jacobi matrix, a number $\lambda$ must be in s
if every $s_n$ contains two points $\lambda_m^n$, $\lambda_{m+1}^n$, where $m = m_n$, which tend to $\lambda$,
as $n \to \infty$.


What is more, the interval $\lambda_m^n \le \lambda \le \lambda_{m+1}^n$ contains at least one point $\lambda$
of s for any n and for any pair of successive points $\lambda_m^n$, $\lambda_{m+1}^n$ of $s_n$.


It remains undecided whether or not the possibility in the second part
of (i) can occur for a bounded Jacobi matrix; that is, whether or not (ii)
remains true if "subsequence of the sets $s_1, s_2, \ldots$" is replaced by "sequence
of the sets $s_1, s_2, \ldots$."


Proof of (i). Let the sequence of integers 1, 2, ... be divided into
two mutually exclusive subsequences, $i_1 < i_2 < \cdots$ and $j_1 < j_2 < \cdots$. Let
H be the matrix defined by placing $h_{mn}$ equal to 0 or 1 according as the pair
(m, n) of integers does not or does have either of the forms $(i_k, j_k)$, $(j_k, i_k)$
for some k = 1, 2, .... It is clear that H is bounded and Hermitian and
that each row (column) of H contains one and only one element distinct
from 0, while the non-vanishing element is always 1. Consequently, H is an
orthogonal matrix, and so its spectrum consists of the eigenvalues $\lambda = \pm 1$
(of infinite multiplicity). On the other hand, if $i_k$ is chosen to be 3k for
k = 1, 2, ..., then $H^{(n)}$ has at most 2[n/3] elements different from 0.
Thus $\lambda = 0$ is in $s_n$, with a multiplicity not less than $n - 2[n/3] \sim n/3$.
It follows that the possibilities mentioned in (i) are realized by H.
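
The construction in the proof of (i) is concrete enough to be tabulated. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $i_k = 3k$ and lets $j_k$ be the k-th positive integer not divisible by 3, builds the n-th section, and counts its vanishing rows. Because every non-vanishing row of the section belongs to a 2-by-2 block exchanging $i_k$ and $j_k$, that count equals the multiplicity of the eigenvalue 0. The function names are hypothetical.

```python
def partner(m):
    """Index paired with m (1-based): i_k = 3k is paired with j_k,
    the k-th positive integer that is not a multiple of 3."""
    if m % 3 == 0:
        k = m // 3
        return k + (k - 1) // 2   # j_k = 1, 2, 4, 5, 7, 8, ...
    k = m - m // 3                # position of m among the non-multiples of 3
    return 3 * k


def section(n):
    """n-th section of H: entry 1 exactly at the pairs (i_k, j_k), (j_k, i_k)."""
    return [[1 if partner(r) == c else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]


def zero_multiplicity(n):
    """Number of vanishing rows of the n-th section, i.e. the
    multiplicity of the eigenvalue 0 of that section."""
    return sum(1 for row in section(n) if not any(row))


for n in (3, 7, 12):
    print(n, zero_multiplicity(n), n - 2 * (n // 3))
```

In the infinite matrix every index has exactly one partner, which is why H itself is orthogonal although each section has a large null space.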




Proof of (i bis). This H realizes, with $k = k_n = n - 2[n/3]$, the
possibility mentioned in (i bis) if the assertion (i bis) is interpreted in a
wider sense, which allows $\lambda_{h+1}^n = \cdots = \lambda_{h+k}^n = 0$ when $\lambda = 0$ is in $s_n$ with
a multiplicity k. It is clear, however, that there exist real diagonal matrices
D such that, when H is the matrix constructed above, H + D fulfills the
requirement of (i bis). In fact, it is seen from Toeplitz's criterion for
a bounded reciprocal (cf. [7], p. 238) that, if $\epsilon_n$ is the n-th diagonal element
of D and $\tfrac{1}{2} > |\epsilon_n| \to 0$, as $n \to \infty$, then $\lambda = 0$ is not in the spectrum of
H + D. On the other hand, if n is fixed and h is an index of one of the
(at least) $n - 2[n/3]$ columns of $H^{(n)}$ containing only zeros, then $\epsilon_h$
is an eigenvalue of the n-th section of H + D.


Proof of (ii). The matrix of the form (2) fails to be a Jacobi matrix
because $h_{2n,2n+1} = h_{2n+1,2n} = 0$ for n = 1, 2, .... But let this matrix be
modified by placing $h_{nn} = 0$, $h_{mn} = 0$ for $|m - n| > 1$, and $h_{2n-1,2n} = h_{2n,2n-1}
= 1$ (as in (2)) and finally $h_{2n,2n+1} = h_{2n+1,2n} = \epsilon_n$ for n = 1, 2, ..., where
$0 < \epsilon_n \to 0$ as $n \to \infty$. The resulting matrix is a bounded Jacobi matrix.
It is seen that if the numbers $\epsilon_n$ are suitably chosen, then the possibility
asserted in (ii) is realized.


Proof of (iii). It is clear that it is sufficient to prove the last part of
(iii). If $\lambda$ is a fixed value, the replacement of H by $H - \lambda I$ merely
translates the spectra s, $s_n$ by $-\lambda$. If $\lambda$ is chosen to be $\tfrac{1}{2}(\lambda_m^n + \lambda_{m+1}^n)$, the
points $\lambda_m^n$, $\lambda_{m+1}^n$ become $-\lambda^*$, $\lambda^*$, where $\lambda^* = \tfrac{1}{2}(\lambda_{m+1}^n - \lambda_m^n) \ge 0$.
Consequently, it is sufficient to show that if $-\lambda^*$ and $\lambda^*$ ($\ge 0$) are in $s_n$ (for a
fixed n), then s contains at least one number $\lambda$ satisfying $|\lambda| \le \lambda^*$ or,
equivalently, that there exists a vector $x = (x_1, x_2, \ldots)$ such that


(19) $|Hx| \le \lambda^*$ ($|x| = 1$).


If x represents either a point of the type $(x_1, \ldots, x_n, 0, 0, \ldots)$ in the
Hilbert space or the corresponding point $(x_1, \ldots, x_n)$ of the n-dimensional
space, then, since H is a Jacobi matrix, it follows from (18) that $Hx = y
= (y_1, y_2, \ldots)$, where

$(y_1, \ldots, y_n) = H^{(n)} x$, $y_{n+1} = h_{n+1,n} x_n$, $y_i = 0$ for $i > n + 1$.

Consequently, $|Hx|^2 = |H^{(n)} x|^2 + |h_{n+1,n} x_n|^2$. Hence,

(20) $|Hx| = |H^{(n)} x|$ if $x_n = 0$.


On the other hand, the finite matrix $H^{(n)}$ has two eigenvalues $-\lambda^*$,
$\lambda^*$ and corresponding eigenvectors $x^1$, $x^2$ satisfying $H^{(n)} x^1 = -\lambda^* x^1$,
$H^{(n)} x^2 = \lambda^* x^2$. Hence, the orthogonality of $x^1$, $x^2$ implies

$|H^{(n)} x| = \lambda^* |x|$,

if x is a linear combination of $x^1$ and $x^2$. Since $x^1$, $x^2$ are linearly independent,
the proviso in (20) holds for such an x (with $|x| = 1$). The last formula
line and (20) imply (19). This completes the proof of (iii).
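
The identity $|Hx|^2 = |H^{(n)} x|^2 + |h_{n+1,n} x_n|^2$, on which (20) rests, is specific to the Jacobi (tridiagonal) form. A minimal numerical sketch follows; the matrix entries, the truncation to a finite N by N matrix and the function names are assumptions made only for illustration.

```python
def jacobi(diag, off):
    """Real symmetric tridiagonal matrix with the given diagonal and off-diagonal."""
    size = len(diag)
    H = [[0.0] * size for _ in range(size)]
    for i in range(size):
        H[i][i] = diag[i]
        if i + 1 < size:
            H[i][i + 1] = H[i + 1][i] = off[i]
    return H


def matvec(H, x):
    return [sum(h * v for h, v in zip(row, x)) for row in H]


def norm_sq(v):
    return sum(c * c for c in v)


def split_norm(H, x_head):
    """For x = (x_1, ..., x_n, 0, 0, ...) return |Hx|^2, |H^(n) x|^2 and
    |h_{n+1,n} x_n|^2 (requires len(H) > len(x_head))."""
    n = len(x_head)
    x = list(x_head) + [0.0] * (len(H) - n)
    full = norm_sq(matvec(H, x))
    head = norm_sq(matvec([row[:n] for row in H[:n]], x_head))
    tail = (H[n][n - 1] * x_head[-1]) ** 2
    return full, head, tail


# Hypothetical entries, used only to exercise the identity.
H = jacobi([0.3, -0.1, 0.7, 0.2, -0.5, 0.4], [1.0, 0.5, 0.25, 0.8, 0.6])
full, head, tail = split_norm(H, [0.2, -0.4, 0.1, 0.9])
print(full, head + tail)
```

When the last retained coordinate $x_n$ vanishes, the third quantity is zero and the two norms agree, which is the content of (20).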


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] R. Fricke, Lehrbuch der Algebra, vol. 1, Braunschweig, 1924.

[2] E. Hellinger, "Neue Begründung der Theorie quadratischer Formen von
unendlichvielen Veränderlichen," Journal für die reine und angewandte
Mathematik, vol. 136 (1909), pp. 219-271.

[3] D. Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen,
Leipzig and Berlin, 1912.

[4] E. J. Routh, The advanced part of a treatise on the dynamics of a system of rigid
bodies, London, 1884.

[5] I. Schur, "Ein Beitrag zur Hilbertschen Theorie der vollstetigen quadratischen
Formen," Mathematische Zeitschrift, vol. 12 (1922), pp. 287-297.

[6] H. Weyl, "Ueber das Spektrum der Hohlraumstrahlung," Journal für die reine
und angewandte Mathematik, vol. 141 (1912), pp. 163-181.

[7] A. Wintner, Spektraltheorie der unendlichen Matrizen, Leipzig, 1929.




AFFINE INVARIANTS OF A PAIR OF HYPERSURFACES.* 


By CHUAN-CHIH HSIUNG.


1. Introduction. In previous papers [1; 2; 3, pp. 573-578] the author
has derived a projective invariant, together with metric and projective
characterizations, for two surfaces having a common tangent plane at two
nonsingular points in ordinary space, and has extended the result to two
hypersurfaces in projective hyperspace. Santaló [4, pp. 564-569] has obtained
affine invariants, together with metric and affine characterizations, for
similarly related pairs of surfaces in ordinary space. The purpose of this note
is to study the problem of Santaló for two hypersurfaces $V_{n-1}$, $V^*_{n-1}$ in
n-dimensional space ($n \ge 3$) having a common tangent hyperplane at two
nonsingular points O, O*. We shall confine our attention to the case in
which the line OO* is not an asymptotic tangent of either of the hypersurfaces
$V_{n-1}$, $V^*_{n-1}$.


2. Derivation of invariants. Let $V_{n-1}$, $V^*_{n-1}$ be two hypersurfaces in
a space $S_n$ of n ($\ge 3$) dimensions having a common tangent hyperplane $t_{n-1}$
at two nonsingular points O, O*, the line OO* not being an asymptotic tangent
of either of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$. First, we choose an orthogonal
cartesian coordinate system with the point O as the origin, the line OO* as
the $x_1$-axis, and the common tangent hyperplane $t_{n-1}$ as the coordinate
hyperplane $x_n = 0$. Then the power series expansions of the hypersurfaces $V_{n-1}$,
$V^*_{n-1}$ in the neighborhoods of the points O, O* may be written in the form


(1) $x_n = \sum_{i,k=1}^{n-1} l_{ik} x_i x_k + \cdots$,

(2) $x_n = m_{11} (x_1 - h)^2 + 2 \sum_{i=2}^{n-1} m_{1i} (x_1 - h) x_i + \sum_{i,k=2}^{n-1} m_{ik} x_i x_k + \cdots$,

where h is the distance between the points O, O* and $l_{11} m_{11} \ne 0$.
In order to find the affine invariants determined by the neighborhoods
of the second order of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ at the points O, O*,
we consider the most general affine transformation which leaves the point O,
the $x_1$-axis and the hyperplane $x_n = 0$ unchanged:


(3) $x_i = \sum_{k=1}^{n} a_{ik} x'_k$ (i = 1, ..., n - 1), $x_n = a_{nn} x'_n$,


* Received February 28, 1949. 




880 CHUAN-CHIH HSIUNG. 


where $a_{21} = a_{31} = \cdots = a_{n-1,1} = 0$, the remaining coefficients being arbitrary,
but

(4) $D = \det (a_{ik})_{i,k=2,\ldots,n-1} \ne 0$.


The effect of this transformation on equations (1), (2) is to produce two
other equations of the same form whose coefficients, indicated by accents,
are given by the formulas

(5) $a_{nn} l'_{11} = a_{11}^2 l_{11}$, $a_{nn} l'_{ik} = \sum_{r,s=1}^{n-1} a_{ri} a_{sk} l_{rs}$ (i = 1, ..., n - 1; k = 2, ..., n - 1),
$a_{11} h' = h$, $a_{nn} m'_{11} = a_{11}^2 m_{11}$, $a_{nn} m'_{ik} = \sum_{r,s=1}^{n-1} a_{ri} a_{sk} m_{rs}$,

where $m_{1i} = m_{i1}$ (i = 2, ..., n - 1).


From equations (4), (5) it is easily seen that the determinants


$L = \det (l_{ik})$, $M = \det (m_{ik})$ (i, k = 1, ..., n - 1)


and their transformed ones L', M' are connected by the relations


(6) $a_{nn}^{n-1} L' = a_{11}^2 D^2 L$, $a_{nn}^{n-1} M' = a_{11}^2 D^2 M$.

Further elimination of $a_{ik}$ from equations (5), (6) shows that the two
quantities

(7) $I = l_{11}/m_{11}$, $J = L/M$


are two affine invariants determined by the neighborhoods of the second order
of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ at the points O, O*.
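
The invariance of the two quantities (7) can be checked in exact arithmetic for n = 3. The following sketch is an illustration only: it assumes that the second-order coefficients transform by the congruence $(A^{T} q A)/a_{nn}$, where A is the upper-left block of the transformation (3) with $a_{21} = 0$, and all numerical values and function names are hypothetical.

```python
from fractions import Fraction as Fr


def transform(q, A, a_nn):
    """Second-order coefficients after the substitution: (A^T q A) / a_nn."""
    size = len(q)
    return [[sum(A[r][i] * q[r][s] * A[s][k]
                 for r in range(size) for s in range(size)) / a_nn
             for k in range(size)] for i in range(size)]


def det2(q):
    return q[0][0] * q[1][1] - q[0][1] * q[1][0]


def invariants(l, m):
    """The pair (I, J) of (7) for n = 3."""
    return l[0][0] / m[0][0], det2(l) / det2(m)


# Hypothetical symmetric coefficient matrices (l_ik), (m_ik).
l = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
m = [[Fr(5), Fr(-1)], [Fr(-1), Fr(4)]]
# Hypothetical transformation: a_21 = 0, a_11 * a_22 * a_33 != 0.
A = [[Fr(3), Fr(7)], [Fr(0), Fr(-2)]]
a_nn = Fr(5, 2)

print(invariants(l, m))
print(invariants(transform(l, A, a_nn), transform(m, A, a_nn)))
```

Both printed pairs agree, since the common factors $a_{11}^2 / a_{nn}$ and $a_{11}^2 D^2 / a_{nn}^{2}$ cancel in the two ratios.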


3. Metric and affine characterizations of the invariants. Let K, K*
be the curvatures of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ at the points O, O*; and
let R, R* be the curvatures at the points O, O* of the plane curves C, C*
of section of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ by the plane $\pi$ of the line OO* and
the normal to the common tangent hyperplane $t_{n-1}$ at any point on the line
OO*. Then from equations (1), (2) and a well known formula [cf. 5, pp.
313-314] for the curvature of a hypersurface at a nonsingular point, it is easy
to obtain the following metric characterizations for the two invariants I, J:




(8) I=R/R*, J=K/K*. 


In order to find an affine characterization of the invariant I, let us
consider any affine transformation, which transforms the hypersurfaces $V_{n-1}$,
$V^*_{n-1}$ into two other hypersurfaces $\bar V_{n-1}$, $\bar V^*_{n-1}$ having the transformed
hyperplane $\bar t_{n-1}$ of $t_{n-1}$ as a common tangent hyperplane at the transformed points
$\bar O$, $\bar O^*$ of the points O, O*, and which transforms the common normal plane $\pi$
through the line OO* of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ into another plane $\bar\pi$
which passes through the line $\bar O \bar O^*$ and makes an angle $\theta$ with the common
normal plane $\tilde\pi$ through the line $\bar O \bar O^*$ of the hypersurfaces $\bar V_{n-1}$, $\bar V^*_{n-1}$. If
$\bar R$, $\bar R^*$ be the curvatures at the points $\bar O$, $\bar O^*$ of the plane curves $\bar C$, $\bar C^*$ of
section of the hypersurfaces $\bar V_{n-1}$, $\bar V^*_{n-1}$ by the plane $\bar\pi$, then $R/R^* = \bar R/\bar R^*$,
Santaló [4, pp. 559-560] having shown that the ratio of the curvatures at
two nonsingular points $O_1$, $O_2$ of two plane curves with $O_1 O_2$ as a common
tangent is an affine invariant. Furthermore, if r, r* be the curvatures at
the points $\bar O$, $\bar O^*$ of the plane curves of section of the hypersurfaces $\bar V_{n-1}$,
$\bar V^*_{n-1}$ by the common normal plane $\tilde\pi$, then by the well known theorem of
Meusnier for the three-dimensional space determined by the two planes $\bar\pi$, $\tilde\pi$,
we obtain $r = \bar R \cos\theta$, $r^* = \bar R^* \cos\theta$. Thus $R/R^* = r/r^*$ and making use
of a result of Santaló [4, pp. 560-561] we therefore arrive at the following
affine characterization of the invariant I.


In the common normal plane $\pi$, let f be the area bounded by a line l
parallel to the line OO* and by the curve C in the neighborhood of the
point O and let f* be the area bounded by the line l (or its symmetric line
with respect to the line OO*) and the curve C* in the neighborhood of the
point O*; then $I = \lim (f^*/f)^2$ as the line l approaches the line OO*.
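
The limit in this characterization can be illustrated numerically. The sketch below is not taken from the paper: it assumes that C and C* are replaced by their circles of curvature (curvatures R and R*), so that f and f* become areas of circular segments cut off by a line at height eps above the common tangent; the function names and sample curvatures are hypothetical.

```python
import math


def segment_area(curvature, eps):
    """Area cut from a circle of the given curvature by a line parallel to
    the tangent at distance eps from it (a circular segment of height eps)."""
    rho = 1.0 / curvature
    half_chord = math.sqrt(2.0 * rho * eps - eps * eps)
    half_angle = math.atan2(half_chord, rho - eps)
    return rho * rho * half_angle - (rho - eps) * half_chord


def area_ratio_squared(R, R_star, eps):
    """(f*/f)^2 for the two osculating circles at the height eps."""
    return (segment_area(R_star, eps) / segment_area(R, eps)) ** 2


# Hypothetical curvatures: R = 2, R* = 1/2, so that I = R/R* = 4.
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    print(eps, area_ratio_squared(2.0, 0.5, eps))
```

To leading order each area behaves like $\tfrac{4}{3}\,\epsilon\,(2\epsilon/R)^{1/2}$, so the squared ratio tends to R/R*, in agreement with (8).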


To characterize affinely the other invariant J we consider the pencil of
hyperquadrics in the hyperplane $x_n = 0$ determined by the two asymptotic
hypercones of the hypersurfaces $V_{n-1}$, $V^*_{n-1}$ at the points O, O*. It is easily
seen that in this pencil there exist n - 1 hyperparaboloids which are given
by the equations


(9) $x_n = 0$, $\sum_{i,k=1}^{n-1} (l_{ik} + \lambda_j m_{ik}) x_i x_k - 2 \lambda_j h \sum_{i=1}^{n-1} m_{i1} x_i + \lambda_j m_{11} h^2 = 0$
(j = 1, ..., n - 1),

where $\lambda_j$ (j = 1, ..., n - 1) are the roots of the equation in $\lambda$

(10) $\det (l_{ik} + \lambda m_{ik}) = 0$ (i, k = 1, ..., n - 1).



The line OO* intersects each of the hyperparaboloids (9) in a pair of points.
Let $P_j$ be any one of each pair of these points, then


(11) D; = Pf/P0* = + [— (j = 1,- *,n—1). 
Thus the invariant J can be expressed in terms of the invariant I and the
n - 1 ratios $D_1, D_2, \ldots, D_{n-1}$ of distances between points as follows:
(12) J= (—1)"J""(D, - 


4. Discussion. Finally, it should be noted that the affine invariants
I, J are not projective invariants, but the affine invariant $I^{(n+1)/3}/J$ is a
projective invariant [3, p. 578].

Moreover, if the line OO* is an asymptotic tangent of either or both of
the hypersurfaces $V_{n-1}$, $V^*_{n-1}$, then either $l_{11} = 0$ or $m_{11} = 0$ or $l_{11} = m_{11} = 0$.
In each of these special cases we can only determine an affine invariant J
given by the second of equations (7) with one of the above conditions on $l_{11}$,
$m_{11}$. The metric characterization of this invariant J obtained in § 3 still can
be used, but the affine one no longer has a meaning.


THE UNIVERSITY OF WISCONSIN. 


REFERENCES. 


1. C. C. Hsiung, "Projective invariants of a pair of surfaces," Duke Mathematical
Journal, vol. 10 (1943), pp. 717-720.

2. C. C. Hsiung, "A projective invariant of a certain pair of surfaces," Duke
Mathematical Journal, vol. 12 (1945), pp. 441-443.

3. C. C. Hsiung, "Some invariants of certain pairs of hypersurfaces," Bulletin of the
American Mathematical Society, vol. 51 (1945), pp. 572-582.

4. L. A. Santaló, "Affine invariants of certain pairs of curves and surfaces," Duke
Mathematical Journal, vol. 14 (1947), pp. 559-574.

5. A. Terracini, "Densità di una corrispondenza di tipo dualistico, ed estensione
dell'invariante di Mehmke-Segre," Atti della Reale Accademia delle Scienze di Torino,
vol. 71 (1936), pp. 310-328.




COMPLETELY SIMPLE IDEALS OF A SEMIGROUP.* 


By R. P. RICH.


This note is concerned with the following question: When does a semigroup
S contain a completely simple ideal M, that is, an ideal which is itself
a completely simple semigroup?

One answer to this question in terms of minimal ideals was given by
A. H. Clifford:¹ If S contains no nilpotent ideals and the minimal two-sided
ideal M of S contains a minimal left ideal L and a minimal right ideal
R, then M is a completely simple semigroup.

In the present note we observe (Theorem A below) that the condition
that S have no nilpotent ideals is unnecessarily restrictive and that we need
only make the (necessary) assumption that M itself be non-nilpotent.

A further, and in a sense complementary, answer to the question
(Theorem B below) is that if S contains a minimal left ideal L and a
minimal right ideal R such that $LR \ne (0)$ and $RL \ne (0)$, then M = LR
is a completely simple semigroup (and in particular a non-nilpotent minimal
two-sided ideal). It should be noticed that in (A) we start with a minimal
two-sided ideal and state conditions that it be completely simple, while in
(B) we start with minimal left and right ideals and produce a minimal
two-sided ideal which is completely simple.


THEOREM A. If M is a minimal two-sided ideal of the semigroup S,
if M contains a minimal left ideal L and a minimal right ideal R, and if
$M^2 \ne (0)$, then M is completely simple.


The only ideals whose non-nilpotence is relevant to the proof given by
Clifford¹ are in fact contained in M, so to establish Theorem A we need
only prove


LEMMA 1. If M is a minimal ideal of the semigroup S (with zero) and
$M^2 \ne (0)$, then M contains no proper nilpotent left or right ideal of S.


Proof. Suppose L is a proper nilpotent left ideal of S contained in M.


* Received February 14, 1949.

¹ A. H. Clifford, "Semigroups without nilpotent ideals," American Journal of
Mathematics, vol. 71 (1949), pp. 834-844, Theorem 3.1. The reader is also referred to this
paper for the definition of terms used here. Here, as there, S is an arbitrary semigroup
with zero.




884 R. P. RICH. 


Then $L \cup LS$ is a nilpotent two-sided ideal of S contained in M and $\ne (0)$,
hence $L \cup LS = M$ since M is minimal, and M is nilpotent, contrary to
hypothesis.


LEMMA 2. If L and R are minimal left and right ideals, respectively,
of the semigroup S (with zero) and $LR \ne (0)$, then:


a) if $0 \ne l \in L$ and $0 \ne r \in R$, then $lr \ne 0$;
b) LR is a minimal two-sided ideal of S.


Proof. Evidently LR is a two-sided ideal of S. Let $0 \ne l' \in L$ and
$0 \ne r' \in R$, with $l'r' \in T$, an ideal of S contained in LR. Let R' be the set of
elements r of R such that $l'r \in T$. For any r in R' and any s in S we have
$l'rs \in T$ since T is an ideal, so $rs \in R'$ and R' is a right ideal $\ne (0)$ of S
contained in the minimal right ideal R; hence R' = R and $l'R \subset T$. By a dual
argument, $Lr \subset T$ for every $r \ne 0$ in R, so $LR \subset T$ and LR is minimal. In
particular, if $l'r' = 0$ let T = (0) and then the same argument shows that
LR = (0), contrary to hypothesis.


LEMMA 3. If L and R are minimal left and right ideals, respectively,
of S, and $LR \ne (0)$, $RL \ne (0)$, then LRL = L.


Proof. Let $0 \ne l_1 \in L$ and $0 \ne r_0 l_0 \in RL$. Since $r_0 l_0 \in R$ we have
$l_1 r_0 l_0 \ne 0$ by Lemma 2a. Hence $LRL \ne (0)$. But LRL is a left ideal of S
contained in the minimal left ideal L, so LRL = L.


THEOREM B. If L and R are minimal left and right ideals, respectively,
of the semigroup S (with zero) and $LR \ne (0)$, $RL \ne (0)$, then LR is a
completely simple semigroup.


Proof. Let M = LR. Then M is a minimal two-sided ideal of S by
Lemma 2b. By Lemma 3, $M^2 = LRL \cdot R = LR \ne (0)$. Since $RL \subset R$ we
have $L = LRL \subset LR = M$, and, dually, $R \subset M$. Thus Theorem A applies
and M = LR is completely simple.

A converse of Theorem B may be stated: 


THEOREM C. If the semigroup S (with zero) contains a completely
simple ideal M, then M = LR, where L and R are minimal left and right
ideals, respectively, of S and $RL \ne (0)$.


Proof. Let M be a completely simple ideal of S and let e be one of its
primitive idempotents. Then L = Me is a minimal² left ideal of M and a


² Ibid., proof of Theorem 3.2.




left ideal of S, hence a minimal left ideal of S. Dually, R = eM is a minimal
right ideal of S. Since M is simple, $LR = Me^2M = M$. And finally, RL
contains $e^2 \ne 0$.

The reasoning of Theorem C shows that both conditions $LR \ne (0)$,
$RL \ne (0)$ are necessary conditions for Theorem B. The following examples
show that they are independent, and hence must both be explicitly assumed.


Example 1. Let S = (0, l, r, b) with lr = b and all other products = 0.
Let L = (0, l), R = (0, r). Then LR contains $b \ne 0$, but RL = (0).
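
Example 1 is small enough to be verified by exhaustion. The following sketch, which is only an illustration with hypothetical function names, encodes the four elements as strings, checks associativity over all triples, and confirms that L and R are one-sided ideals with LR = (0, b) and RL = (0).

```python
from itertools import product

S = ["0", "l", "r", "b"]


def mul(x, y):
    """Multiplication of Example 1: l * r = b, every other product is 0."""
    return "b" if (x, y) == ("l", "r") else "0"


def is_associative():
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x, y, z in product(S, repeat=3))


def is_left_ideal(I):
    return all(mul(s, x) in I for s in S for x in I)


def is_right_ideal(I):
    return all(mul(x, s) in I for s in S for x in I)


def set_product(A, B):
    return {mul(a, b) for a in A for b in B}


L, R = {"0", "l"}, {"0", "r"}
print(is_associative(), is_left_ideal(L), is_right_ideal(R))
print(sorted(set_product(L, R)), sorted(set_product(R, L)))
```

Since L and R each contain a single non-zero element, the only ideals properly contained in them are (0), so minimality needs no separate check.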


Example 2. Let S = (0, a, l, r, s), with the table:

[Multiplication table of S; not legible in the source.]

Let L = (0, a, l), R = (0, a, r). Then RL contains $a \ne 0$, but LR = (0).
In both examples L is a minimal left, R a minimal right ideal.


THE JOHNS HOPKINS UNIVERSITY.




ON THE CLASSICAL EXISTENCE THEOREM OF ANALYTIC 
DIFFERENTIAL EQUATIONS.* 


By EARL A. CODDINGTON and AUREL WINTNER.


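1. Let $f_1, \ldots, f_n$, where $f_j = f_j(s; w_1, \ldots, w_n)$, be n power series in
$w_1, \ldots, w_n$, say


(1) $f_j = \sum_{k=0}^{\infty} \cdots \sum_{m=0}^{\infty} a^{j}_{k \cdots m}(s)\, w_1^k \cdots w_n^m$,


where the coefficients a = a(s) are Laplace-Stieltjes transforms,


(2) $a(s) = \int_1^{\infty} e^{-sx}\, d\alpha(x)$,


of (complex-valued) functions $\alpha(x)$ of bounded variation on the half-line
$1 \le x < \infty$. Suppose that the total variations,


(3) $[\alpha] = \int_1^{\infty} |d\alpha(x)|$,

of the functions $\alpha(x)$ satisfy the n conditions

(4) $\sum_{k=0}^{\infty} \cdots \sum_{m=0}^{\infty} [\alpha^{j}_{k \cdots m}]\, r^{k + \cdots + m} < \infty$


for some, sufficiently small, positive r. In particular, the functions (1) are
regular in the domain


(5) $\Re s > 0$, $|w_i| < r$

of the n + 1 complex variables s, $w_i$.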
The following theorem will be proved: 


In a domain (5), let $f_1, \ldots, f_n$ be regular functions representable, in
terms of functions (2) satisfying (4), in the form (1). Then there exists
a half-plane $\Re s > l$ ($\ge 0$) on which the system


(6) $dw_i/ds = f_i(s; w_1, \ldots, w_n)$


* Received November 8, 1948. 


ANALYTIC DIFFERENTIAL EQUATIONS. 


has a unique solution representable in the form


(7) $w_i(s) = \int_1^{\infty} e^{-sx}\, d\beta_i(x)$.


In addition, the n integrals (7) are absolutely convergent in the half-plane
$\Re s > l$, if l is large enough.


In particular, the solution (7) of (6) belongs to the "initial condition"
(8) $w_i(\infty) = 0$, where $w_i(\infty) = \lim w_i(s)$ as $s \to \infty$.


If (1) is replaced by


$F_i = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \cdots \sum_{m=0}^{\infty} a\, z^j w_1^k \cdots w_n^m$,

where every a is a constant, the classical existence theorem of analytic systems
of ordinary differential equations can be formulated as follows: If each of
the n power series occurring in the last formula line is (absolutely) convergent
in some domain, say $|z| < 1$; $|w_1| < r, \ldots, |w_n| < r$, then there exists
a circle $|z| < b$ on which the system $dw_i/dz = F_i$ has a (unique, regular)
solution $w_i = w_i(z)$ satisfying the initial conditions $w_i(0) = 0$. This
theorem of Cauchy is the simplest particular case of the theorem to be
proved. In order to see this, it is sufficient to put $z = e^{-s}$ in (1), (2), (6),
and assume that each of the coefficient functions, a(s), in (1) is a power
series in $e^{-s}$, i.e., that each of the functions $\alpha(x)$ occurring in (2) is a step
function the jumps of which are restricted to x = 1, 2, ....

For the case in which (6) is a linear system, the above theorem was
recently proved,¹ along with a refinement generalizing the fact that, in case
of linear systems, the existence theorem need not be "localized" by the choice
of a "sufficiently large" l. The treatment of the non-linear case will depend
on an adaptation of the procedure used loc. cit.


2. In order to simplify the formulae, the proof will be given for the
case, n = 1, of a single differential equation. It will be clear that the case
of an arbitrary n can be treated in just the same manner. As a matter of
fact, n > 1 can always be reduced to n = 1, if the process involved is based
on the method of majorants, which will be the case below.

If n = 1, then (6) reduces, by (1) and (2), to the differential equation


¹ A. Wintner, "On the classical existence theorem of linear differential equations,"
American Journal of Mathematics, vol. 71 (1949), pp. 331-338.




888 EARL A. CODDINGTON AND AUREL WINTNER. 


(9) $dw/ds = \sum_{m=0}^{\infty} w^m \int_1^{\infty} e^{-sx}\, d\alpha_m(x)$,


where, according to (4), there exists an r > 0 satisfying


(10) $\sum_{m=0}^{\infty} [\alpha_m]\, r^m < \infty$.


The assertion is that (10) implies for (9) the existence of a unique solution
representable, on some half-plane $\Re s > l$, as an absolutely convergent
Laplace-Stieltjes integral,

(11) $w(s) = \int_1^{\infty} e^{-sx}\, d\beta(x)$.

It can of course be assumed that the given functions $\alpha_m(x)$, occurring
in (9), are normalized by


(12) $\alpha(1) = 0$
and
(13) $\alpha(x) = \alpha(x - 0)$ if $1 < x < \infty$.


All the functions $\beta$, $\beta_k$, to be defined below, will be understood to have been
normalized by the cases $\alpha = \beta$, $\beta_k$ of (12) and (13).

If (11) is substituted into (9), a "comparison of the coefficients"
supplies the following characteristic condition for the unknown function $\beta(x)$:


(14) $-x\, d\beta(x) = d \sum_{m=0}^{\infty} \{\beta^m(x) * \alpha_m(x)\}$,


where the asterisk denotes the operation of convolution and $\beta^m$ is an
abbreviation for


(15) $\beta^m = \beta * \beta * \cdots * \beta$ (m factors).

It is understood that, if m = 0,

(16) $\beta^0(x) * \alpha(x) = \alpha(x)$


(and that $\beta^1 = \beta$).
It is clear from the definition of the symbols [ ], * that


(17) $[\lambda * \mu] \le [\lambda]\,[\mu]$.


The integral definition of a convolution also shows that, if either of the
functions $\lambda(x)$, $\mu(x)$, where $1 \le x < \infty$, vanishes identically for $1 \le x \le c$,
where c > 1, then the same is true of the function $\lambda(x) * \mu(x)$. These obvious
facts will be essential in what follows.
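
For step functions, the two facts just stated reduce to elementary bookkeeping on the jumps. The sketch below is an illustration under stated assumptions rather than the authors' formalism: a step function is represented by a dictionary mapping each jump point $x \ge 1$ to its jump, the convolution places the product of two jumps at the sum of their positions, and the names `convolve`, `total_variation` and `transform` are hypothetical.

```python
import math
from fractions import Fraction as Fr


def convolve(lam, mu):
    """Jumps of lam * mu: the product of two jumps sits at the sum of their points."""
    out = {}
    for x, a in lam.items():
        for y, b in mu.items():
            out[x + y] = out.get(x + y, 0) + a * b
    return {x: c for x, c in out.items() if c != 0}


def total_variation(alpha):
    """The symbol [alpha] of (3) for a step function."""
    return sum(abs(c) for c in alpha.values())


def transform(alpha, s):
    """Laplace-Stieltjes transform (2) of a step function at a real s."""
    return sum(float(c) * math.exp(-s * float(x)) for x, c in alpha.items())


lam = {Fr(1): 1, Fr(2): 1}
mu = {Fr(1): 1, Fr(2): -1}
nu = convolve(lam, mu)

print(total_variation(nu), total_variation(lam) * total_variation(mu))
print(min(nu), min(lam) + min(mu))
```

The first printed pair shows that (17) can be a strict inequality when jumps cancel, and the second shows that the first jump of the convolution lies at the sum of the first jumps of the factors, which is the vanishing property used in what follows.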


3. In order to find the unknown function $\beta(x)$, try for it a series,
(18) $\beta(x) = \sum_{k=1}^{\infty} \beta_k(x).$
Then (16) and (15) show that what (14) requires is 


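$-x\, d\sum_{k=1}^{\infty} \beta_k(x) = d\alpha_0(x) + d\sum_{m=1}^{\infty} \Big\{ \sum_{k=1}^{\infty} \beta_k(x) * \cdots * \sum_{k=1}^{\infty} \beta_k(x) \Big\} * \alpha_m(x),$

where $\sum_{k=1}^{\infty} \beta_k(x)$ occurs in { } exactly $m$ times. Since this means that the
$m$-th { } is $\sum_{k=1}^{\infty} {}^{m}\gamma_k(x)$, where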
and 
(20) = (2) * 


it follows that what (14) requires of (18) is formally equivalent to the 
condition 


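$-x\, d\sum_{k=1}^{\infty} \beta_k(x) = d\alpha_0(x) + d\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} {}^{m}\gamma_k(x) * \alpha_m(x).$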


Clearly, the latter condition is satisfied if (though not only if)


(21) $-x\, d\beta_1(x) = d\alpha_0(x)$
and 
(22) — cd =d 3 ™ye(2) * an(2). 


This particular choice is made because (22), (20) and (19) together
obviously have the structure of a set of recursion formulae (which commence
with (21), the data being the functions $\alpha_m$). The existence proof can be
based on these recursion formulae, as follows: 


Since $1 \le x < \infty$, it is clear from (3) and (21) that $[\beta_1] \le [\alpha_0]$.
Similarly, from (22), 


S 3 [om] 




by (17). Furthermore, from (20) and (17), 


[re] S3 Lees] 


where $[{}^{1}\gamma_k] = [\beta_k]$, by (19). Since every [ ] is non-negative, these recursive
inequalities (in which the members could, for the present, be $+\infty$)
imply that the inequalities


(23) $[\beta_k] \le b_k$ $(k = 1, 2, \cdots)$
hold for the sequence $b_1, b_2, \cdots$ defined by the following rule of recursion:
(24) = > [om by [a], 
m=1 
where 
k-1 
j=1 


But the latter system of recursion formulae has a simple significance. 


4. In order to see this, put 


(26) $F(w) = \sum_{m=0}^{\infty} [\alpha_m]\, w^{m}$

and

(27) $w = \sum_{k=1}^{\infty} b_k z^{k},$


substitute both (26) and (27) into the equation


(28) $w = zF(w)$

for $w = w(z)$, and compare the coefficients of $z, z^{2}, z^{3}, \cdots$ in the power
series which thus result on the left and on the right of (28). This supplies
for the coefficients, $b_1, b_2, \cdots$, of (27) a recursion formula. But the latter


is readily seen to be identical with (24) by virtue of (25). 
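
The recursion that (28) imposes on the coefficients of (27) can be checked numerically. The following is a minimal sketch and not part of the original argument: it assumes that F is a polynomial, given by a finite list of non-negative numbers standing in for the $[\alpha_m]$, and the function names are hypothetical. It solves $w = zF(w)$ by repeated substitution on truncated power series, each pass fixing one further coefficient.

```python
def mul_trunc(p, q, n):
    """Multiply two coefficient lists, discarding powers of z above z**n."""
    out = [0] * (n + 1)
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j > n:
                break
            out[i + j] += pi * qj
    return out


def majorant_coefficients(a, n):
    """Return [b_1, ..., b_n], where w(z) = sum b_k z**k solves w = z*F(w)
    and F(w) = sum a[m] * w**m (a[m] plays the role of [alpha_m])."""
    w = [0] * (n + 1)              # w[k] is the coefficient of z**k; w[0] = 0
    for _ in range(n):             # after pass j, b_1..b_j are exact
        power = [1] + [0] * n      # coefficients of w**0
        f_of_w = [0] * (n + 1)
        for a_m in a:
            for k in range(n + 1):
                f_of_w[k] += a_m * power[k]
            power = mul_trunc(power, w, n)
        w = [0] + f_of_w[:n]       # multiply F(w) by z
    return w[1:]


print(majorant_coefficients([1, 1], 5))      # F(w) = 1 + w
print(majorant_coefficients([1, 0, 1], 7))   # F(w) = 1 + w**2
```

For F(w) = 1 + w the exact solution is w = z/(1 - z), so every coefficient equals 1. In both examples all coefficients come out non-negative, in agreement with the remark on the sign of the $b_k$.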

Hence, the values on the right of the inequalities (23) can be charac- 
terized as the coefficients of the solution (27) of (28), provided that (28) 
has a solution (27). But it does. In fact, since (10), where $[\alpha_m] \ge 0$,
is supposed to hold for some $r > 0$, the power series (26) is the Maclaurin
expansion of a function $F(w)$ which is regular at $w = 0$. On the other hand,
if $G(z,w) = w - zF(w)$, the partial derivative $G_w(z,w)$ at $(z,w) = (0,0)$
is $G_w = 1 \ne 0$, and so the equation $G(z,w) = 0$ has a (unique) solution
$w = w(z)$ which is regular, and vanishes, at $z = 0$. This means that the
equation (28) is satisfied by a (unique) power series (27) which converges




near $z = 0$. Incidentally, since the coefficients, (3), of the power series (26)
are real and non-negative, it follows, by successive differentiations of (28)
at $z = 0$, that $d^{k}w/dz^{k} \ge 0$ holds at $z = 0$ for every $k$, which means that
$b_k \ge 0$ in (27).

Since (27) has a non-vanishing radius of convergence, (27) converges
at $z = e^{-l}$ if $l > 0$ is large enough. For such an $l$,


(29) $\sum_{k=1}^{\infty} [\beta_k]\, e^{-lk} < \infty$

by (23). This proves, in particular, that

(30) $[\beta_k] < \infty$ $(k = 1, 2, \cdots)$,


i.e., that every term of the series (18) is of bounded variation on the
half-line $1 \le x < \infty$.
In order to complete the proof of the italicized theorem, recourse must
be had to the remark made after (17). This remark, when applied to the
recursion formulae (19)-(22) which define $\beta_1, \beta_2, \cdots$, is readily seen to
imply that


(31) $\beta_k(x) = 0$ if $1 \le x \le k$,
(12) being satisfied (by assumption) for every $\alpha = \alpha_m$.


5. According to (30) and (3), each of the integrals 


(32) ${}^{k}w(s) = \int_{1}^{\infty} e^{-sx}\, d\beta_k(x),$


where $k = 1, 2, \cdots$, is absolutely convergent on the half-plane $\Re s \ge l$. But
(31) shows that (32) can be written in the form


${}^{k}w(s) = \int_{k}^{\infty} e^{-sx}\, d\beta_k(x).$


Hence it is clear from (29) and (3) that the series
(33) $w(s) = \sum_{k=1}^{\infty} {}^{k}w(s)$


is absolutely-uniformly convergent on the half-plane $\Re s \ge l$, and that the
same is true if $d\beta_k(x)$ is replaced by $|d\beta_k(x)|$ in (32), (33).
Accordingly, 




This implies that the series (18) is convergent for $1 \le x < \infty$, since (12) is
satisfied by $\alpha = \beta_k$. It also follows that, if $\beta(x)$ is defined by (18), and if
(32) is substituted into (33), the term-by-term integration leading from (33)
to (11) is legitimate on the half-plane $\Re s \ge l$.

Since the recursion formulae defining the terms of (18) were chosen 
so as to lead to an integral (11) satisfying (9), the proof of the italicized 
theorem is now complete, except for the assertion claiming the uniqueness 
of such a solution. But the uniqueness of the solution is easy to see. In 
fact, every function representable, for large $\Re s$, in the form (11) is majorized
by a constant multiple of $\exp(-\Re s)$ as $\Re s \to \infty$, and the same holds for
the derivative of (11). Hence, the uniqueness of the solution (11) follows 
by an obvious adaptation of the standard uniqueness proof in case of a (local) 


Lipschitz constant. 


6. Let the derivative of $w_i$ in the differential system (6) be replaced
by $w_i$ itself. What results, the implicit system


(34) $w_i = f_i(s, w_1, \cdots, w_n)$,
can be treated in the same way as the differential system (6):


The italicized theorem remains true if, in its wording, (34) is read
instead of (6).


In fact, if (6) is replaced by (34), then what corresponds to (19)-(22) 
is identical with the recursion formulae which result if the factor $-x$ is
suppressed on the left of (21) and of (22). But this factor (which, since
$|-x| \ge 1$, could have been used as divisor promoting convergence) was
not used at all. Hence, the above proof contains the proof of the last 


italicized theorem. 


THE JOHNS HOPKINS UNIVERSITY.




ON COMPACT COMPLEX ANALYTIC VARIETIES.*


By WEI-LIANG CHOW.


1. Introduction. Let $S_n$ be the projective space of $n$ dimensions over
the field $k$ of all complex numbers. For any given choice of the inhomogeneous
coordinates $x_1, \cdots, x_n$ in $S_n$, let $A_n(x)$ be the $n$-dimensional complex affine
space consisting of all the points of $S_n$ which are finite with respect to this
coordinate system. A point set $U$ in $S_n$ is said to be analytic in the neighborhood
of a point $(a)$ if, choosing the inhomogeneous coordinates so that $(a)$ is a
point in $A_n(x)$, there exists a neighborhood $R$ of $(a)$ in $A_n(x)$ such that the
intersection $U \cap R$ consists of all the points of $R$ which are common solutions
of a finite set of equations,
(1) $f_i(x_1, \cdots, x_n) = 0$, $(i = 1, \cdots, s)$,


where the $f_i(x_1, \cdots, x_n)$, $i = 1, \cdots, s$, are holomorphic functions in the
region $R$. A point set in $S_n$ is called an analytic variety if it is analytic
in the neighborhood of every one of its points. An analytic variety is called
compact if it is a compact point set. A point set in $S_n$ is called an algebraic
variety if it consists of all the points whose homogeneous coordinates
$x_0, x_1, \cdots, x_n$ satisfy a finite set of equations

$\phi_i(x_0, x_1, \cdots, x_n) = 0$, $(i = 1, \cdots, t)$,

where the $\phi_i(x_0, \cdots, x_n)$, $i = 1, \cdots, t$, are homogeneous polynomials in
the $x_0, x_1, \cdots, x_n$. It is easily seen that an algebraic variety is a compact
analytic variety. There is a classical conjecture that conversely any compact
analytic variety in $S_n$ is also an algebraic variety. So far as we are aware,
no proof of this conjecture has ever been offered. It is the purpose of this
paper to present a proof of this conjecture.

We begin by recalling some well known properties of an analytic variety.¹
Let $R$ be a neighborhood of the origin of $A_n(x)$ defined by the conditions
$|x_i| < \epsilon$, $i = 1, \cdots, n$. Let $k\{x_1, \cdots, x_r\}$ be the ring of all power series
in the variables $x_1, \cdots, x_r$ which are convergent in the region $|x_i| < \epsilon$,
$i = 1, \cdots, r$. Consider a set of equations of the following type


* Received April 9, 1949. 
¹ For the results of the theory of functions of several complex variables used in
this paper, we refer here once for all to Osgood [7], Bochner and Martin [1].




894 WEI-LIANG CHOW. 


(2) $x_{r+1}^{p} + B_1 x_{r+1}^{p-1} + \cdots + B_p = 0$,
$H x_j = G_j(x_{r+1})$, $(j = r+2, \cdots, n)$,


where the $B_1, \cdots, B_p$ and $H$ are elements of the ring $k\{x_1, \cdots, x_r\}$ and
the $G_{r+2}, \cdots, G_n$ are polynomials in $x_{r+1}$ with coefficients in
$k\{x_1, \cdots, x_r\}$; and the left hand side of the first equation is an irreducible
distinguished polynomial in $x_{r+1}$ over $k\{x_1, \cdots, x_r\}$ and $H$ is its discriminant.
Let $D$ be the set of all common solutions of the equations (2) in the region $R$
for which $H \ne 0$, as well as those which are limiting points of such solutions.
Such a point set $D$, or any point set in $S_n$ which can be represented as such a
set by a suitable choice of the affine space $A_n(x)$ and its coordinates, is called
an analytic element in $S_n$. The number $r$ is called the dimension of the
analytic element $D$, and we shall write $D_r$ to indicate this. The number $p$
depends in general upon the choice of the affine space and its coordinates;
the smallest possible value of this number for a given $D_r$ is
called its order. An analytic element is called regular if it has the
order one, otherwise it is called singular. The point $(a)$ is called the center
of $D_r$. It is well known from the theory of functions of several complex
variables that every point of an analytic variety $W$ has a neighborhood $R$ such
that $R \cap W$ consists of a finite number of analytic elements. In other words,
an analytic variety is a topological sub-space of $S_n$ which has a system of
neighborhoods consisting of analytic elements. An analytic variety is called
irreducible if any two analytic elements of it can be obtained from each other
by analytic continuations. It is easily seen that all the analytic elements of
an irreducible analytic variety $W$ must have the same dimension $r$; we shall
call this number $r$ the dimension of $W$ and we shall write $W_r$ to indicate this.
Following a recent practice in algebraic geometry, we shall from now on use
the expression “analytic variety” to denote exclusively an irreducible analytic
variety; the reducible ones being taken care of later by the more precise
concept of an analytic cycle. A point of an analytic variety $W_r$ is called regular
if it has a neighborhood consisting of a single regular analytic element ; those 
points of W, which are not regular are called singular. The set W, of all 
singular points of W, constitutes a finite or enumerably infinite number of 
analytic varieties of dimensions less than $r$; we shall call this set W, the
singular part of W,. The set W,—W, of all the regular points of W, is a 
connected set. 

For our present purpose the most important properties of an analytic 


variety W, are the following. 


(A) W, is a topological complex. This is the triangulation theorem for 




ON COMPACT COMPLEX ANALYTIC VARIETIES. 895 


analytic loci, for the proof of which we shall refer to the literature.² In
particular, if W, is compact, it is a finite complex and can be taken as a sub- 
complex of a simplicial subdivision of the projective space S, and the 
singular part W, can in turn be taken as a subcomplex of W,. 


(B) W, can be covered by an enumerable aggregate of regular analytic 
elements of dimensions $\le r$. This can be easily proved by induction, starting
with the facts that W,— W, can be covered by an enumerable aggregate of
analytic elements of dimension r and that W consists of an enumerable aggre- 
gate of analytic varieties of dimensions < r. 

In the following sections we shall show that the main theorem can be
proved by using these two properties (A) and (B) alone, without referring
directly to the fact that each point of $W_r$ has a neighborhood consisting of a
finite number of analytic elements of dimension $r$. For this reason, we shall
give a new (possibly more general) definition of an analytic variety in § 2,
essentially by means of these two properties (A) and (B) only. In §§ 2 and 3
we shall study the topological properties of analytic varieties and derive from
them the Theorem III, which expresses a special property of a regular analytic
element $D_r$ contained in a compact analytic variety of $r$ dimensions. In § 4 we
then proceed to show (Theorem IV) that this property is really a characteristic
property of a regular analytic element $D_r$ which is contained in an algebraic
variety of $r$ dimensions. The main theorem (Theorem V) then follows immediately
from these two results. In the last section (§ 5) we shall indicate some
applications of our main theorem.

Finally, we add a few words about the terminology and notations. A
topological complex $K$ is a homeomorphic image of an Euclidean complex, and
we shall say $2r$-complex if it consists only of simplexes of topological dimension
$2r$ and their sides. The boundary of an unoriented complex $K$ (i.e. considered
as a chain modulo 2) will be denoted by K. If the complex $K$ is oriented to
become a chain with integral coefficients, then its chain boundary will also be
denoted by K. There is no danger of confusion in this, as the meaning will
be clear from the context in each case. The word “dimension” without any
further qualification will mean the complex dimension, which is half the
topological dimension; subscripts will be used to denote the former, superscripts
the latter.


2. Analytic varieties. An analytic simplex $E_r$ of $r$ dimensions in $S_n$ is
a topological $2r$-simplex in $S_n$, which is a one-to-one analytic image of a
topological $2r$-simplex in a complex affine space $A_r(z)$ of $r$ dimensions. More


² See Koopman and Brown [4], Lefschetz and Whitehead [6].




specifically this means that, for a suitable choice of the inhomogeneous
coordinates $x_1, \cdots, x_n$ in $S_n$, the points of $E_r$ can be represented parametrically
by a set of $n$ equations


(3) $x_i = f_i(z_1, \cdots, z_r)$, $i = 1, \cdots, n$,


where the $n$ functions $f_i(z_1, \cdots, z_r)$, $i = 1, \cdots, n$, are all holomorphic in
a region $R$ of the affine space $A_r(z)$ of complex parameters $z_1, \cdots, z_r$, and
the matrix $\| \partial f_i/\partial z_k \|$, $i = 1, \cdots, n$, $k = 1, \cdots, r$, has the maximum rank $r$
at every point of the region $R$; and that this representation induces a homeomorphism
between $E_r$ and a topological $2r$-simplex $C^{2r}$ in the region $R$.
Without loss of generality, we can assume that the coordinate origin $(z) = (0)$
of $A_r(z)$ is contained in the interior of $C^{2r}$; the corresponding point $(x) = (a)$
in $E_r$ is then called the center of $E_r$ (with respect to the parameters $z_1, \cdots, z_r$).
It is obvious that if the analytic simplex $E_r$ is subdivided into a complex,
then each simplex of this complex is also an analytic simplex. Since in all
our arguments we can always replace any complex by a subdivision in which
the simplexes are arbitrarily small, we can therefore assume, without any
essential restriction to the concept of an analytic simplex, that for a suitable
choice of the coordinates $x_1, \cdots, x_n$ and the parameter space $A_r(z)$, the
equations (3) have the form


(4) $x_i = z_i$, $i = 1, \cdots, r$; $x_j = f_j(z_1, \cdots, z_r)$, $j = r+1, \cdots, n$,


and that the region $R$ has been chosen so small that the functions $f_{r+1}, \cdots, f_n$
can be represented as power series which are convergent in $R$. Since the
equations (4) map the region $R$ into a regular analytic element $D_r$ in $S_n$,
it follows that an analytic simplex $E_r$ is simply a topological $2r$-simplex
imbedded in a regular analytic element $D_r$. On the other hand, the set of
interior points H,—, of an analytic simplex is itself a regular analytic 
element. Hence it is rather indifferent whether a point set is covered by an 
aggregate of analytic simplexes or an aggregate of regular analytic elements; 
the use of either one instead of the other is only a matter of convenience. 
The concept of an analytic complex in $S_n$ is defined by induction as
follows. Any topological 0-complex in $S_n$ is an analytic complex $K_0$ of zero
dimension. A topological $2r$-complex (finite or infinite) in $S_n$ is called an
analytic complex K, of r dimensions, if it contains a subcomplex K,. which 
consists of an enumerable aggregate of analytic complexes of dimensions less 
than r, such that any point of K,— (Kr + K,) has a neighborhood which is 
a regular analytic element D,. The points of K,— (Kr+ K,) are called 




regular points of K. The topological complex K; is also called the boundary 
of Ky. The aggregate of analytic complexes K, is called the singular part of 
K,, though not every point (or even any point) of it need be a singular 
point in either the topological or analytic sense. An analytic simplex J, 
is of course also an analytic complex of r dimensions, the boundary being the 
boundary #, and the singular part being the empty set. An analytic variety 
W, of r dimensions in S, is a connected analytic complex with the properties 
that it has no boundary and its singular part W, consists of an enumerable 
aggregate of analytic varieties of dimensions less than r. This definition by 
induction can be completed by the stipulation that an analytic variety W, of 
zero dimension consists of a single point in S,. An analytic variety is called 
compact if it is a compact point set; that is, if it is a finite topological complex. 

It is clear that the set W,— W, of all the regular points of W, can be 
covered by an enumerable aggregate of analytic simplexes of r dimensions. 
From this and the fact that W, consists of an enumerable aggregate of analytic 
varieties of lower dimensions, it follows by induction that the analytic variety 
$W_r$ can be covered by an enumerable aggregate of analytic simplexes of dimensions
$\le r$. However, it is to be noticed that this covering of $W_r$ by an
enumerable aggregate of analytic simplexes is not a simplicial subdivision 
of W, as a complex in the topological sense, nor is it a covering by neighbor- 
hoods of W, as a topological subspace of S,. It is simply a representation of 
W, as the set-theoretic sum of an enumerable aggregate of analytic simplexes 
of various dimensions, and this aggregate will be in general infinite even in 
case of a compact analytic variety W,. 

Let $E_r$ and $E_s$ be two analytic simplexes in $S_n$ contained in the regular
analytic elements $D_r$ and $D_s$ respectively. The intersection $D_r \cap D_s$ of the
two analytic elements $D_r$ and $D_s$, if it is not empty, consists of an enumerable
aggregate of connected components, each of which is an analytic variety of
dimension not less than $r + s - n$. In case such a component has exactly
the dimension $r + s - n$, it is called a proper component of the intersection.
It is easily seen that if the join of the tangent spaces of $D_r$ and $D_s$ at some one
regular point of such a component has the dimension $n$, then the component
has the dimension $r + s - n$ and is therefore proper. In such a case we shall
call the component regular. We shall say that two analytic simplexes $E_r$
and $E_s$ are regular with respect to each other if either (1) the intersection
$E_r \cap E_s$ is empty, or (2) each point of $E_r \cap E_s$ is contained in a regular
component of $D_r \cap D_s$.

It is clear that for $r + s < n$ the first possibility alone can occur, so that
in this case two analytic simplexes $E_r$ and $E_s$ are regular with respect to each




other if and only if they are disjoint. In case $r + s = n$, the intersection of
two analytic simplexes $E_r$ and $E_{n-r}$, regular with respect to each other, is
either empty or consists of a finite number of points at each of which the
tangent spaces of $D_r$ and $D_{n-r}$ are transversal to each other. We are mainly
interested in the case $r + s = n$, though the following results hold also for
the general case.

Let $P$ be the group of all projective transformations in $S_n$; it is an
analytic manifold of $(n+1)^{2} - 1$ dimensions. It is easily seen that if $E_r$
is an analytic simplex in $S_n$ and $T$ is a projective transformation in $S_n$, then
the transformed set $TE_r$ of $E_r$ is also an analytic simplex of $r$ dimensions.
It follows then that if $K_r$ is an analytic complex covered by an aggregate of
analytic simplexes $\{E\}$, then $TK_r$ is also an analytic complex covered by the
aggregate of the transformed simplexes $\{TE\}$. In the proof of the following
theorems, we shall make use of the simple fact about everywhere dense sets
which is true for every regular separable topological space: The intersection
of a finite or enumerably infinite number of everywhere dense open sets is
also an everywhere dense set. Since a closed set is nowhere dense if its
complement is everywhere dense, it follows that the sum of a finite or enumerably
infinite number of nowhere dense closed sets is a set the complement
of which is everywhere dense; hence, if this sum is a closed set, then it is
also nowhere dense.


THEOREM I. Let $E_r$ and $E_s$ $(r + s \le n)$ be two analytic simplexes in
$S_n$ and let $\Omega$ be the set of all elements $T$ of $P$ such that $E_r$ and $TE_s$ are
regular with respect to each other; then $\Omega$ is an everywhere dense open set in $P$.


Proof. We shall choose the inhomogeneous coordinates $x_1, \cdots, x_n$ in
$S_n$ in such a way that both $E_r$ and $E_s$ are in the affine space $A_n(x)$, which is
always possible if these simplexes are sufficiently small. Let $E_r$ be given by
the one-to-one analytic mapping


(5) $x_i = f_i(z_1, \cdots, z_r)$, $i = 1, \cdots, n$,


of a region $R'$ in $A_r(z)$ into $A_n(x)$, so that $E_r$ is the image of a topological
$2r$-simplex $C^{2r}$ in $R'$. Similarly, let $E_s$ be given by the one-to-one analytic
mapping


(6) $x_i = g_i(w_1, \cdots, w_s)$, $i = 1, \cdots, n$,


of a region $R''$ in $A_s(w)$ into $A_n(x)$, so that $E_s$ is the image of a topological
$2s$-simplex $C^{2s}$ in $R''$.
We first prove that the set $\Omega$ is everywhere dense in $P$; it is evidently




enough to show this for the neighborhood of the identity transformation $I$.
Let $T_{(e)}$ be the affine transformation of $A_n(x)$,

$x_i' = x_i + e_i$, $i = 1, \cdots, n$;

it is of course also a projective transformation in $S_n$ and is arbitrarily near
to the identity transformation $I$ for points $(e)$ sufficiently near to $(0)$. The
transformed analytic simplex $T_{(e)}E_s$ is then given by the one-to-one analytic
mapping

$x_i = g_i(w_1, \cdots, w_s) + e_i$, $i = 1, \cdots, n$,

of the topological simplex $C^{2s}$ in the region $R''$. Consider the analytic
mapping
(7) $u_i = F_i(z,w) = f_i(z_1, \cdots, z_r) - g_i(w_1, \cdots, w_s)$, $i = 1, \cdots, n$,

of the region $R = R' \times R''$ in the product space $A_{r+s}(z,w) = A_r(z) \times A_s(w)$
into the space $A_n(u)$. Let $C^{2r+2s}$ be the $2(r+s)$-simplex $C^{2r} \times C^{2s}$ in the
region $R$. It is clear that every common solution of the equations (7) for
$(u) = (e)$ in the simplex $C^{2r+2s}$ will correspond to an intersection point of
$E_r$ and $T_{(e)}E_s$ and conversely. For $r + s < n$, the image of $C^{2r+2s}$ in $A_n(u)$
under the analytic transformation (7) is a nowhere dense set; hence there
exist points $(e)$ in $A_n(u)$ arbitrarily near to the origin such that the equations
(7) have no common solution in $C^{2r+2s}$ for $(u) = (e)$. Then the analytic
simplexes $E_r$ and $T_{(e)}E_s$ will be disjoint and hence regular with respect to
each other. Thus the assertion is proved for this case. For $r + s = n$, we
can assume that for all points $(e)$ in a sufficiently small neighborhood of $(0)$
the two analytic simplexes $E_r$ and $T_{(e)}E_s$ have an intersection, for otherwise
the assertion is obviously true already. This means that for all points $(u)$
in a sufficiently small neighborhood of the origin in $A_n(u)$ the equations (7)
will have a common solution in $C^{2r+2s}$. It is well known³ that in this case
the Jacobian determinant

$J(z,w) = | \partial F_i/\partial z_j, \ \partial F_i/\partial w_k | = | \partial f_i/\partial z_j, \ -\partial g_i/\partial w_k |$

does not vanish identically in the region $R$. Then the set of all the solutions
of the equation $J(z,w) = 0$ in $R$, if it is not empty, constitutes an enumerable
aggregate of analytic varieties of dimension $n - 1$ in $R$. Let $E^{(i)}$, $i = 1, 2, \cdots$,
be a sequence of analytic simplexes covering this set $J$, and let $E^{(i)*}$,
$i = 1, 2, \cdots$, be the corresponding images in $A_n(u)$ under the analytic
transformation (7). Then the image $J^{*}$ of $J$ under the analytic transformation
(7) is evidently the sum of all the sets $E^{(i)*}$, $i = 1, 2, \cdots$. Since each
$E^{(i)*}$ is a nowhere dense set in $A_n(u)$, it follows that the complement of $J^{*}$
in $A_n(u)$ is an everywhere dense set. This means that there exists in every
neighborhood of the origin of $A_n(u)$ at least one point $(e)$ which is not in $J^{*}$.
Then for this point $(u) = (e)$ the equations (7) will have no common solution
in the set $J$; or, in other words, we have $J(z,w) \ne 0$ at every common solution
in $C^{2r+2s}$ of the equations (7) for $(u) = (e)$. This implies that every intersection
point of $E_r$ and $T_{(e)}E_s$ corresponds to a point in $C^{2r+2s}$ at which
$J(z,w) \ne 0$; hence $E_r$ and $T_{(e)}E_s$ are regular with respect to each other.
Thus our assertion is also proved for the case $r + s = n$.


³ See Knopp and Schmidt [3], p. 379.
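
A small worked example, not taken from the original, may make the case r + s = n concrete. Assume n = 2 and r = s = 1, and take for the two simplexes pieces of the curves $x_2 = x_1^2$ and $x_2 = -x_1^2$ near the origin, with the hypothetical parametrizations $f(z) = (z, z^2)$ and $g(w) = (w, -w^2)$. The curves are tangent at the origin, so they are not regular with respect to each other there; the computation below sketches why almost every translation $(e)$ removes the tangency.

```latex
% Assumed parametrizations (illustration only): f(z) = (z, z^2), g(w) = (w, -w^2).
\begin{align*}
F_1(z,w) &= z - w, & F_2(z,w) &= z^{2} + w^{2},\\
J(z,w) &= \begin{vmatrix} 1 & -1 \\ 2z & 2w \end{vmatrix} = 2(z + w).
\end{align*}
% Intersections of the first curve with the translated second curve
% are the solutions of F(z,w) = (e_1, e_2).
% The zero set of J is the line w = -z; its image under F is
\[
J^{*} = \{(u_1,u_2) : u_1 = 2z,\ u_2 = 2z^{2}\} = \{(u_1,u_2) : u_1^{2} = 2u_2\},
\]
% a nowhere dense curve in the (u_1,u_2)-plane. For every (e) with
% e_1^2 \neq 2 e_2, each solution of F(z,w) = (e) has J \neq 0,
% so all intersections are regular.
```

The untranslated position $(e) = (0,0)$ lies on $J^{*}$, which reflects the tangency at the origin.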

It remains to show that the set $\Omega$ is open in $P$; that is, we have to show
that if $E_r$ and $E_s$ are regular with respect to each other, then $E_r$ and $TE_s$ are
also regular with respect to each other for all $T$ in a sufficiently small neighborhood
of the identity $I$. For $r + s < n$, this is obvious; for the $E_r$ and $E_s$
are disjoint closed subsets in $S_n$, and $TE_s$ varies continuously with $T$. To
prove the assertion for $r + s = n$, let a projective transformation $T$ be given
by the equations

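$x_i' = \Big(a_{i0} + \sum_{j=1}^{n} a_{ij} x_j\Big) \Big/ \Big(a_{00} + \sum_{j=1}^{n} a_{0j} x_j\Big)$, $i = 1, \cdots, n$.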


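If the matrix $(a_{ij})$ is sufficiently near the identity, then the analytic simplex
$TE_s$ will still lie in the affine space $A_n(x)$ and is given by the one-to-one
analytic mapping

$x_i = \Big(a_{i0} + \sum_{j=1}^{n} a_{ij}\, g_j(w_1, \cdots, w_s)\Big) \Big/ \Big(a_{00} + \sum_{j=1}^{n} a_{0j}\, g_j(w_1, \cdots, w_s)\Big)$, $i = 1, \cdots, n$,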


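of the topological simplex $C^{2s}$ in the region $R''$. Consider the Jacobian
determinant
$J^{T}(z,w) = | \partial F_i^{T}/\partial z_j, \ \partial F_i^{T}/\partial w_k |$
and the $n$ functions

$F_i^{T}(z,w) = f_i(z_1, \cdots, z_r) - \Big(a_{i0} + \sum_{j=1}^{n} a_{ij}\, g_j(w)\Big) \Big/ \Big(a_{00} + \sum_{j=1}^{n} a_{0j}\, g_j(w)\Big)$, $i = 1, \cdots, n$,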

which are all defined in the region $R$. It is clear that these functions are
also continuous functions of the transformation $T$ in a neighborhood of the
identity $I$. Now, it is easily seen that the two analytic simplexes $E_r$ and
$TE_s$ are regular with respect to each other if and only if the following
equations


(8) $J^{T}(z,w) = 0$, $F_i^{T}(z,w) = 0$, $i = 1, \cdots, n$,




have no common solution in the topological simplex $C^{2r+2s}$ in the region $R$.
Let $A^{T}$ be the set of all the common solutions of the equations (8) in $R$;
then $A^{T}$ is closed in $R$ and varies continuously with $T$. Since $E_r$ and $E_s$
are assumed to be regular with respect to each other, the intersection
$C^{2r+2s} \cap A^{I}$ is empty. Now, if $C^{2r+2s} \cap A^{T}$ were non-empty for some $T$ in
every neighborhood of $I$, then there would be a sequence $T^{(i)}$,
$i = 1, 2, \cdots$, converging toward $I$ such that each $C^{2r+2s} \cap A^{T^{(i)}}$ is not empty.
Since $C^{2r+2s}$ is a compact set, the sequence of sets $C^{2r+2s} \cap A^{T^{(i)}}$ has at least
one limit point in $C^{2r+2s}$; and it is easily seen that this limit point is a
point of $C^{2r+2s} \cap A^{I}$, which is a contradiction. Therefore, the intersection
$C^{2r+2s} \cap A^{T}$ is empty for every $T$ in a sufficiently small neighborhood of the
identity $I$; and this means that $E_r$ and $TE_s$ are regular with respect to each
other for all such $T$. This concludes the proof of Theorem I.

From Theorem I we can deduce a similar theorem for analytic varieties.
We shall say that a point is a regular intersection of two analytic complexes
$K_r$ and $K_{n-r}$ if it is a regular point of both $K_r$ and $K_{n-r}$ and if the tangent
spaces of $K_r$ and $K_{n-r}$ at this point are transversal to each other.


THEOREM II. Let $W_r$ and $W_{n-r}$ be two analytic varieties in $S_n$, and
let $\Omega$ be the set of all the transformations $T$ in $P$ such that $W_r$ and $TW_{n-r}$
have only regular intersection points. Then the set $\Omega$ is everywhere dense
in $P$.


Proof. Let $\{E\}$ and $\{F\}$ be two enumerable sets of analytic simplexes
which cover the analytic varieties $W_r$ and $W_{n-r}$ respectively. Then the set
$\{G\} = \{(E, F)\}$ of all the pairs of analytic simplexes, one from each of the
two sets $\{E\}$ and $\{F\}$, is also enumerable and hence can be arranged in a
sequence $G^{(i)} = (E^{(i)}, F^{(i)})$, $i = 1, 2, \cdots$. For each $i = 1, 2, \cdots$, let
$\Omega^{(i)}$ be the set of all transformations $T$ in $P$ such that $E^{(i)}$ and $TF^{(i)}$ are
regular with respect to each other. It is easily seen that the intersection of
all the sets $\Omega^{(i)}$, $i = 1, 2, \cdots$, consists of exactly those transformations $T$
such that $W_r$ and $TW_{n-r}$ have only regular intersection points. Since,
according to Theorem I, each set $\Omega^{(i)}$ is an everywhere dense open set in $P$,
it follows that $\Omega$ is also an everywhere dense set in $P$.


3. Intersection of analytic varieties. There is a “natural” orientation
for an analytic simplex $E_r$ which can be extended to any analytic complex $K_r$.
Let $E_r$ be an analytic simplex in $S_n$ defined by the one-to-one analytic
mapping (3) of the topological $2r$-simplex $C^{2r}$ in $A_r(z)$. Let $z_j = z_j' + i z_j''$,
$j = 1, \cdots, r$; then the ordered set of $2r$ real coordinates
$z_1', z_1'', \cdots, z_r', z_r''$ of the space $A_r(z)$ determines a definite orientation of this space and
consequently also an orientation of the simplex $C^{2r}$. This orientation of $C^{2r}$
determines an orientation of $E_r$ which we shall call its natural orientation.
It can be easily shown that the natural orientation is independent of the
choice of the analytic parameters and is thus an intrinsic property of an
analytic simplex. The set of all regular points K,— (K,+ K,) of an analytic 
complex K, can be covered by a set of analytic simplexes of r dimensions. It 
can be shown that if each analytic simplex of the set is given its natural 
orientation, then all these orientations are coherent; and if we orient each 
topological 2r-simplex of K, in concordance with the natural orientation of 
any one analytic simplex EF, contained in it, we obtain a topological 2r-chain 
on K, with boundary in the sub-complex K,, or a 2r-cycle mod K,. Thus 
an analytic complex K, is an orientable pseudo-manifold, and it has a natural 
orientation determined by its analytical structure. This applies in particular 
also to the projective space S, itself, since it is also an analytic complex. 
Furthermore, if K, is a finite complex, its natural orientation will make it a 
topological 2r-chain in S, with boundary in the subset K,. Hence a compact 
analytic variety W,, if oriented with the natural orientation, is a topological 
$2r$-cycle in $S_n$. We shall from now on assume that all analytic complexes,
including the space $S_n$ itself, are oriented with the natural orientation, so
that a finite analytic complex $K_r$ corresponds to a uniquely determined
topological $2r$-chain in $S_n$. The significance of the natural orientation is given
by the following well-known result: 


LEMMA 1. The topological intersection multiplicity of a regular intersection
point of two finite analytic complexes $K_r$ and $K_{n-r}$ is $+1$.


For the proof of this lemma, which is very simple, we refer to the
literature.⁴
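
The sign in Lemma 1 rests on a standard fact of linear algebra, recalled here as a sketch and not as the proof to which the author refers: a complex-linear map, regarded as a real-linear map in the real and imaginary parts of the coordinates, has non-negative determinant. The symbols A, B, C below are introduced only for this illustration.

```latex
% A = B + iC is a complex n x n matrix, with B and C real.
% In the real coordinates (Re z, Im z) the same map has the 2n x 2n matrix below.
\[
\det\begin{pmatrix} B & -C \\ C & B \end{pmatrix}
  = \det(B + iC)\,\det(B - iC)
  = \lvert \det A \rvert^{2} \;\ge\; 0 ,
\]
% with equality only if A is singular.
```

Applied to a change of analytic parameters, this gives the independence of the natural orientation from the choice of parameters; applied to the direct sum of two transversal complex tangent spaces, it is the reason the multiplicity is positive.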

As to the topological properties of $S_n$, it is well known that the $2r$-th
homology group of $S_n$ is cyclic and the class determined by a linear analytic
variety $L_r$ of $r$ dimensions is a generator of this group. From this we can
derive at once the following consequences: (1) To each $2r$-cycle in $S_n$, there
is associated an integer $g$ called its degree; it is the total topological intersection
number of this $2r$-cycle with a linear variety $L_{n-r}$; (2) two $2r$-cycles
are homologous if and only if they have the same degree; (3) the total
topological intersection number of a $2r$-cycle of degree $g$ and a $2(n-r)$-cycle
of degree $h$ is equal to $gh$. In view of Lemma 1 we can conclude that the


⁴ For the topological properties of analytic varieties, we refer once for all to
Lefschetz [5], Ch. VTL, § 3 and van der Waerden [9].




degree of a compact analytic variety $W_r$ is always positive; for it is equal
to the number of intersection points of $W_r$ with a linear variety $L_{n-r}$ which has
only regular intersections with $W_r$.
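
Consequence (3) can be illustrated on the simplest case, a conic (degree 2) and a line (degree 1) in $S_2$, which should meet in 2 · 1 = 2 points when all intersections are regular. The sketch below is an illustration under stated assumptions and not part of the original: it works in one affine chart, takes the conic x² + y² = 1 and a line y = mx + c with 1 + m² ≠ 0, so that no intersection lies at infinity, and the function name is hypothetical.

```python
import cmath


def conic_line_intersections(m, c):
    """Complex intersection points of x^2 + y^2 = 1 with y = m*x + c.

    Returns (points, regular); regular is True when the two roots are
    distinct, i.e. when the line is not tangent to the conic.
    """
    a2 = 1 + m * m                 # coefficient of x^2 after substitution
    if a2 == 0:
        raise ValueError("1 + m^2 must be non-zero (intersection at infinity)")
    a1 = 2 * m * c                 # coefficient of x
    a0 = c * c - 1                 # constant term
    disc = a1 * a1 - 4 * a2 * a0
    root = cmath.sqrt(disc)
    xs = [(-a1 + root) / (2 * a2), (-a1 - root) / (2 * a2)]
    points = [(x, m * x + c) for x in xs]
    return points, disc != 0


points, regular = conic_line_intersections(1, 3)   # line misses the real circle
print(len(points), regular)
```

A tangent line such as y = 1 gives a double root, that is, a single point which must be counted with multiplicity 2; this is the situation that the deformation argument of the following lemma is designed to handle.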

We shall need the following lemma concerning the multiplicity of an
intersection point. Let $B^{s}$ and $B^{2n-s}$ be two topological chains in $S_n$, and let
$(p)$ be an isolated point of $B^{s} \cap B^{2n-s}$ which does not lie in the boundary of
either $B^{s}$ or $B^{2n-s}$. Then this point $(p)$ has a uniquely determined multiplicity as an
intersection of the two chains $B^{s}$ and $B^{2n-s}$. For the determination of this
multiplicity we have the following criterion:


LEMMA 2. Let R be a neighborhood of (p) such that the closure of R
does not intersect the boundary of B^s or of B^{2n-s} or any other component of
the intersection B^s ∩ B^{2n-s}, and let A^s and A^{2n-s} be two chains which are
d-homologous to the chains B^s and B^{2n-s} respectively, where d is a sufficiently
small positive number depending on B^s, B^{2n-s} and R. Then each component of the
intersection A^s ∩ A^{2n-s} lies either in R or outside of R, and the intersection
multiplicity of (p) is equal to the sum of the intersection multiplicities of those
components of A^s ∩ A^{2n-s} which lie in R.


In the above lemma the notion of d-homology is defined as follows: Let Σ
be a d-neighborhood of B^s and Σ' be a closed d-neighborhood of the boundary
of B^s; then a chain A^s is said to be d-homologous to B^s if it is contained in Σ
and is homologous to B^s mod Σ' on Σ. This includes in particular the case when
A^s is a d-deformation of B^s. The proof of Lemma 2 follows easily from the
well-known results about intersection of chains,⁵ so that we can omit it here.

Let W_r and W_{n-r} be two compact analytic varieties of degrees g
and h respectively, and let (p) be an isolated intersection point of them.
Let R be a neighborhood of the point (p) in S_n, and let d be the
number such that Lemma 2 holds for all d-deformations of W_r and W_{n-r}.
According to Theorem II, there exists a projective transformation T such that
TW_{n-r} is a d-deformation of W_{n-r} and the two analytic varieties W_r and TW_{n-r}
have only regular intersection points. Since according to Lemma 1 each of
these intersection points has exactly the multiplicity +1, it follows that
the intersection W_r ∩ TW_{n-r} consists of exactly gh points. If μ of these gh
points lie in the neighborhood R, then μ is the multiplicity of (p) as an
intersection of W_r and W_{n-r}. Applying this in particular to a regular point of
W_r, we obtain the following result: Let E_r be an analytic simplex in S_n and
(p) be an interior point of E_r. If E_r is an element of a compact analytic
variety W_r, then there exists a positive number N such that if W_{n-r} is a


5 See Lefschetz [5], Ch. IV, § 3. 




904. WEI-LIANG CHOW. 


compact analytic variety of degree h and if (p) is an isolated intersection
point of E_r with W_{n-r}, then the multiplicity of this intersection is not greater
than hN. In fact, we can take N to be any number not less than the degree
of W_r. In the next section we shall see that this fact is characteristic of an
analytic simplex which is an element of an algebraic variety of r dimensions.
However, in order to state this result in the definitive form which we shall use
in the next section, we introduce the concept of an analytic cycle. An analytic
cycle Z_r of r dimensions in S_n is a topological 2r-cycle in S_n which can be
expressed as a sum of multiples of compact analytic varieties. Thus an
analytic cycle Z_r is a finite set of compact analytic varieties W_r^{(1)}, ..., W_r^{(l)},
to each of which is assigned an integer n_i (positive or negative) called its
multiplicity. If g_i is the degree of W_r^{(i)}, then the integer Σ n_i g_i is
evidently the degree of the analytic cycle Z_r. An analytic cycle Z_r is called
positive if the multiplicities of its component varieties are all positive; it is
called algebraic if all of its component varieties are algebraic. It is a simple
matter to show that what we have proved above for compact analytic varieties
can be generalized to positive analytic cycles. Thus we have the following
theorem:


THEOREM III. Let E_r be an analytic simplex in S_n and (p) be an
interior point of E_r. If E_r is an element of a compact analytic variety W_r,
then there exists a positive number N such that if Z_{n-r} is a positive analytic
cycle of degree h and if (p) is an isolated intersection point of E_r with Z_{n-r},
then the multiplicity of this intersection is not greater than hN.


We have stated the above theorem only for a regular analytic element, 
as this is the only case we shall need for our purpose. It is however clear 
that the theorem holds in general for any analytic element, whether regular 
or not. Incidentally we should like to remark that it is well known⁶ that
an isolated intersection of two analytic varieties, in the usual sense as defined 
in section 1, has always a positive multiplicity. This property is fundamental 
for many applications of topological methods to algebraic geometry, but we 
have no direct use for it in the present paper. In fact, we do not know whether 
this property is true at all for the analytic varieties as we have defined them in
the previous section; for the proof of it depends on the fact that the neighbor-
hood of each point of an analytic variety consists of a finite number of analytic 
elements. What is essential for our present purposes is not to rule out the 
possibility of a zero multiplicity for an isolated intersection point, but to 


6 See van der Waerden [9], or Lefschetz [5], Ch. VIII, § 4.




rule out the possibility of a negative multiplicity for any connected com- 
ponent (of any dimension) of the intersection. That the latter is true for 
intersections of positive analytic cycles in S, can be easily deduced from our 
results, but it is not generally true for intersections of positive analytic cycles 
in an arbitrary analytic or algebraic manifold. 


4. Proof of the main theorem. We begin with two lemmas.


LEMMA 3. Let f(x_1, ..., x_r) be a power series in r variables x_1, ..., x_r.
Given any positive integer N, we can always find an integer M with the
following property: For every integer m > M, there exists a polynomial
F(x_1, ..., x_{r+1}) in r+1 variables x_1, ..., x_{r+1}, of degree m, such that the
power series F(x_1, ..., x_r, f(x_1, ..., x_r)) contains no terms of degree ≤ mN.


Proof. Let F(x_1, ..., x_{r+1}) be the general polynomial of degree m in
r+1 variables x_1, ..., x_{r+1}. If ^{h}C_{j} denotes the binomial coefficient
h!/j!(h-j)!, then F contains ^{m+r+1}C_{r+1} terms and hence that many
indeterminate coefficients (c). The power series F(x_1, ..., x_r, f(x_1, ..., x_r))
contains at most ^{mN+r}C_{r} terms of degrees ≤ mN, the coefficients of which are
linear forms L_j(c) in the (c). The ^{mN+r}C_{r} linear equations L_j(c) = 0 will
have a non-trivial solution in the (c) if ^{m+r+1}C_{r+1} > ^{mN+r}C_{r}. This will be
the case if we take m greater than M = (N+1)^r (r+1); for we have then

(m+1) > (r+1)(mN+r)^r / m^r,

hence [(m+r+1) ··· (m+1)]/(r+1) > (mN+r) ··· (mN+1),
hence ^{m+r+1}C_{r+1} > ^{mN+r}C_{r}.
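
The counting step in this proof can be checked numerically. The following is a minimal sketch, assuming that the bound in the proof reads M = (N+1)^r (r+1) and that the two counts are the binomial coefficients for polynomials of degree at most m in r+1 variables and for monomials of degree at most mN in r variables; the function names are ours and purely illustrative.

```python
from math import comb


def threshold(N: int, r: int) -> int:
    """Assumed bound M = (N + 1)^r * (r + 1) from the proof of Lemma 3."""
    return (N + 1) ** r * (r + 1)


def unknowns(m: int, r: int) -> int:
    """Number of coefficients of the general polynomial of degree m
    in r + 1 variables."""
    return comb(m + r + 1, r + 1)


def equations(m: int, N: int, r: int) -> int:
    """Number of monomials of degree <= m*N in r variables; each one
    gives one homogeneous linear condition on the coefficients."""
    return comb(m * N + r, r)


def has_nontrivial_solution(m: int, N: int, r: int) -> bool:
    """More unknowns than homogeneous linear equations guarantees a
    non-trivial solution."""
    return unknowns(m, r) > equations(m, N, r)


if __name__ == "__main__":
    for r in (1, 2, 3):
        for N in (1, 2, 5):
            m = threshold(N, r) + 1
            print(r, N, m, unknowns(m, r), equations(m, N, r),
                  has_nontrivial_solution(m, N, r))
```

The check only confirms the inequality between the two counts for sample values; it does not construct the polynomial F itself.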

Before proceeding to the next lemma we shall recall here some well-known
facts about the intersection of two algebraic cycles Z_r and Z_s in S_n.
While in general any two arbitrary cycles C' and C'' in S_n have a "topological
intersection" consisting of a homologous family of cycles about their geometric
intersection C' ∩ C'', it is possible in case of the algebraic cycles Z_r and Z_s
to define an intersection cycle Z_r · Z_s provided all the components of the
geometric intersection Z_r ∩ Z_s are proper (i.e. have the dimension r+s-n).
Let U^{(1)}, ..., U^{(l)} be the components of Z_r ∩ Z_s; they are all algebraic
varieties of dimension r+s-n. Then we can assign in a unique manner
to each component U^{(i)} a positive integer n_i such that the algebraic cycle
Σ_{i=1}^{l} n_i U^{(i)} is a member of the homologous family of cycles constituting the
"topological intersection" of Z_r and Z_s. This cycle Σ_{i=1}^{l} n_i U^{(i)} is called the




intersection cycle Z_r · Z_s of Z_r and Z_s, and the numbers n_1, ..., n_l are called
the intersection multiplicities of U^{(1)}, ..., U^{(l)} respectively. The degree of
Z_r · Z_s is the product of the degrees of Z_r and Z_s. Corresponding to
Lemmas 1 and 2 of the preceding section, we have here the following results
about the intersection cycle Z_r · Z_s: (1) If V_r and V_s are component varieties
of multiplicity one of Z_r and Z_s respectively, and if U is a regular component
of V_r ∩ V_s and is not contained in any other component of Z_r or Z_s, then
the multiplicity of U in the cycle Z_r · Z_s is equal to +1. We shall say that
U is a regular component of V_r ∩ V_s if it contains a point which is regular
for both V_r and V_s and if the join of the tangent spaces of V_r and V_s at
this point has the dimension n. (2) Given any positive number d there
is a positive number e (depending of course also on Z_r and Z_s) such that if
the algebraic cycles Z'_r and Z'_s are e-homologous to Z_r and Z_s respectively,
then the cycle Z'_r · Z'_s is d-homologous to the cycle Z_r · Z_s. All these hold
in fact (with slight modifications) also for the intersection of analytic
cycles and complexes, but we shall not need them here.


LEMMA 4. Let E_r be an analytic simplex in S_n with center at (p) and
let L_{n-r+1} be a linear variety containing the point (p) such that the
intersection of E_r with L_{n-r+1} is an analytic simplex E_1 with center at (p). If a
positive algebraic cycle Z_{n-1} of degree m has an isolated intersection with E_1
at the center (p) with a multiplicity μ, then the intersection cycle of Z_{n-1}
and L_{n-r+1} is a positive algebraic cycle Z_{n-r} of degree m which has an isolated
intersection of multiplicity μ at the point (p) with E_r.


Proof. The linear variety L_{n-r+1} does not lie in any component
hypersurface of Z_{n-1}; for otherwise the entire analytic simplex E_1 would lie in
Z_{n-1}, in contradiction to our assumption that Z_{n-1} has an isolated intersection
with E_1 at the point (p). Therefore every component of the intersection
L_{n-r+1} ∩ Z_{n-1} is an algebraic variety of dimension n-r and hence is proper.
Therefore the intersection cycle of L_{n-r+1} and Z_{n-1} is a positive algebraic
cycle Z_{n-r} of degree m, and it remains only to show that this cycle Z_{n-r}
intersects E_r at the point (p) with the multiplicity μ. It is obviously enough to
prove the assertion for the case when Z_{n-1} is an irreducible hypersurface, for
otherwise we can apply the same argument to each irreducible component of
Z_{n-1}. The assumption that Z_{n-1} intersects E_1 at the center (p) with the
multiplicity μ implies that there is a projective transformation T arbitrarily
near to the identity and such that TZ_{n-1} has exactly μ regular intersections
with E_1 which lie in any given sufficiently small neighborhood R of (p). For
such a T the components of TZ_{n-1} ∩ L_{n-r+1} are all proper and hence there




is an intersection cycle Z^T_{n-r} = TZ_{n-1} · L_{n-r+1}; and since each of the μ regular
intersection points of TZ_{n-1} with E_1 is contained in exactly one regular
component of TZ_{n-1} ∩ L_{n-r+1}, it is contained in exactly one component of
multiplicity one of the cycle Z^T_{n-r}. Let the positive number d be chosen
with respect to the chains Z_{n-r} and E_r and the neighborhood R of the point
(p) as in Lemma 2, and let the positive number e be chosen with respect
to the number d and the intersection cycle Z_{n-r} = Z_{n-1} · L_{n-r+1} as in the
remark (2) immediately preceding this Lemma. Since for T sufficiently
near to the identity the cycle TZ_{n-1} is an e-deformation of Z_{n-1}, it follows
that for such a T the intersection cycle Z^T_{n-r} is d-homologous to Z_{n-r}. It is
clear that the part of the intersection Z^T_{n-r} ∩ E_r which lies in R consists of
the same μ intersection points of TZ_{n-1} and E_1. Since each of these points
is contained in exactly one component variety of multiplicity one of Z^T_{n-r},
and since it is also a regular intersection of this component variety with E_r,
therefore it is an intersection of multiplicity one of Z^T_{n-r} and E_r. It
follows then from Lemma 2 that the multiplicity of the intersection (p) of
Z_{n-r} and E_r is equal to μ.

With the help of these two lemmas we can prove the following theorem: 


THEOREM IV. Let E_r be an analytic simplex in S_n. Suppose there
exists a positive integer N such that if any positive algebraic cycle Z_{n-r} of
degree m in S_n has an isolated intersection with E_r at its center, then the
multiplicity of this intersection is at most equal to mN. Then E_r is an
element of an algebraic variety V_r in S_n.


Proof. The theorem is evidently equivalent to the following statement:
If E_r is not an element of an algebraic variety V_r in S_n, then given any
positive integer N there exists a positive algebraic cycle Z_{n-r} of degree m such
that Z_{n-r} has an isolated intersection with E_r of multiplicity greater than
mN at the center of E_r.

Let D_r be the analytic element containing E_r. Let the affine space
A_n(x) be so chosen that the analytic element D_r is given by the set of
equations

x_{r+1} = f_{r+1}(x_1, ..., x_r), ..., x_n = f_n(x_1, ..., x_r),

where the f_{r+1}(x), ..., f_n(x) are convergent power series in
a region in the affine space A_r(x) of the r complex variables x_1, ..., x_r. We
can assume without any loss of generality that E_r is the image of a simplicial
2r-simplex C^{2r} in A_r(x) and that the center of E_r is the origin of A_n(x).
Since the analytic simplex E_r, hence also the analytic element D_r, is not




algebraic, at least one of the power series x_{r+1} = f_{r+1}(x_1, ..., x_r), ...,
x_n = f_n(x_1, ..., x_r), say the power series x_{r+1} = f_{r+1}(x_1, ..., x_r), is not an
algebraic element over the field k(x_1, ..., x_r). According to Lemma 3, there
exists to any given positive integer N a polynomial F(x) = F(x_1, ..., x_{r+1})
of degree m (> M) in the r+1 variables x_1, ..., x_{r+1}, such that the power
series P(x_1, ..., x_r) = F(x_1, ..., x_r, f_{r+1}(x_1, ..., x_r)) contains no terms
of degree ≤ mN. Since the element f_{r+1}(x_1, ..., x_r) is not algebraic over
k(x_1, ..., x_r), the power series P(x_1, ..., x_r) is not identically zero.
Hence, after a suitably chosen linear transformation in A_r(x) if necessary,
we can assume that the power series P(x_1) = P(x_1, 0, ..., 0) is not
identically zero. It is clear that this power series P(x_1) contains no terms of
degree ≤ mN. Let L_{n-r+1} be the linear variety in S_n defined by the r-1
linear equations x_2 = 0, ..., x_r = 0; then the intersection of L_{n-r+1} with D_r
is a regular analytic element D_1 given by the equations x_2 = 0, ..., x_r = 0,
x_{r+1} = f_{r+1}(x_1, 0, ..., 0), ..., x_n = f_n(x_1, 0, ..., 0), and the intersection
of L_{n-r+1} with E_r is an analytic simplex E_1 contained in D_1, which is
the image of the topological 2-simplex C^2 obtained from C^{2r} by intersection
with the subspace x_2 = x_3 = ··· = x_r = 0 in A_r(x). (Thus C^2 is a 2-simplex
in the x_1-plane.)

Consider now the homogeneous algebraic equation

x_0^m F(x_1/x_0, ..., x_{r+1}/x_0) = 0;

it defines a positive algebraic cycle Z_{n-1} in S_n consisting of as many
hypersurfaces as the distinct irreducible factors on the left side of this equation,
each taken with its multiplicity. The fact that P(x_1) = F(x_1, 0, ..., 0,
f_{r+1}(x_1, 0, ..., 0)) vanishes for x_1 = 0 but not identically shows that the
point (x) = (0) is an isolated intersection of Z_{n-1} and E_1. It only remains
to prove that the multiplicity of this intersection is greater than mN, for
then our theorem will follow from Lemma 4. Here again we can assume
that the algebraic cycle Z_{n-1} is an irreducible hypersurface, which means that
the form on the left side and hence also the polynomial F(x_1, ..., x_{r+1})
is irreducible; for otherwise we can apply the same argument to each
component hypersurface of Z_{n-1}. According to Lemma 2, it is sufficient to show
that given any neighborhood R of the point (x) = (0) in A_n(x) there exists
a linear transformation T(e),

x'_i = x_i + e_i, i = 1, ..., r+1,

arbitrarily near to the identity, such that T(e)Z_{n-1} has more than mN regular
intersections with E_1 in R.




Consider the power series in x_1 with coefficients in k(u) = k(u_1, ..., u_{r+1}),

Φ(x_1, (u)) = F(x_1 + u_1, u_2, ..., u_r, u_{r+1} + f_{r+1}(x_1, 0, ..., 0)),

which is convergent for all sufficiently small x_1 for any given (u). The
intersection of T(e)Z_{n-1} and E_1 is then given by the zeros of the power series
Φ(x_1, (-e)) in C^2, and it is well known that a simple zero x_1 = b_1
corresponds to a point (b) = (b_1, 0, ..., 0, f_{r+1}(b_1, 0, ..., 0), ..., f_n(b_1, 0, ..., 0))
on E_1 which is a regular intersection of T(e)Z_{n-1} and E_1. Now we have the
fact that Φ(x_1, (0)) = P(x_1) has a zero of order μ > mN at the point x_1 = 0.
Hence, by the Weierstrass Preparation Theorem, there is a neighborhood R'
of x_1 = 0 and a neighborhood R'' of (u) = (0) such that

Φ(x_1, (u)) = Q(x_1, (u)) g(x_1, (u)),

where Q(x_1, (u)) is a distinguished polynomial of degree μ in x_1 with center
at x_1 = 0, (u) = (0), and g(x_1, (u)) is a power series which does not vanish
for x_1 in R' and (u) in R''. The distinguished polynomial Q(x_1, (u)) might
be reducible as a polynomial in x_1 over the field of all quotients of power
series in (u), but it cannot have multiple factors. In fact, if any power series
q(x_1, (u)) with the property q(0, (0)) = 0 is a multiple factor of Q(x_1, (u)),
then the power series q(u) = q(0, (u)) is a multiple factor of Q(0, (u)) and
hence also a multiple factor of the polynomial Φ(0, (u)) = F(u). One
observes that the power series q(u) is not a constant, for q(u) is not identically
zero and we have q(0) = 0. Hence the equation q(u) = 0 defines a finite
number of analytic elements of r dimensions in A_{r+1}(u) with center at
(u) = (0). If one of the variables (u), say u_1, is actually involved in the
polynomial F(u), then ∂F(u)/∂u_1 is also a polynomial in (u). It follows
then that both polynomials F(u) and ∂F(u)/∂u_1 are divisible by the power
series q(u). This means that the two positive algebraic cycles of r dimensions
in A_{r+1}(u) defined by the two equations F(u) = 0 and ∂F(u)/∂u_1 = 0 have
at least an analytic element of r dimensions in common and consequently
also at least one component variety in common. Since F(u) is an irreducible
polynomial, the algebraic cycle F(u) = 0 is an irreducible algebraic variety
and hence must be a component variety of the algebraic cycle ∂F(u)/∂u_1 = 0.
This means that the polynomial F(u) is a factor of the polynomial ∂F(u)/∂u_1,
both being now considered as elements of the polynomial ring k[u]. But
this is impossible, for the polynomial ∂F(u)/∂u_1 has a lower degree than
F(u). Thus we have shown that the distinguished polynomial Q(x_1, (u))
has no multiple factors.⁷


7 The last part of the argument consists essentially in showing that an irreducible 




Since Q(x_1, (u)) has no multiple factors, its discriminant H(u) is not
identically zero in R''. Hence for any point (e) in R'' such that H(e) ≠ 0,
the polynomial Q(x_1, (e)) has exactly μ simple zeros and consequently also
the power series Φ(x_1, (e)) has exactly μ simple zeros in R'. For all points
(e) sufficiently near to the point (u) = (0) with H(e) ≠ 0, the μ simple
zeros will lie in any given neighborhood of the point x_1 = 0 and hence the
corresponding μ intersection points on E_1 will lie in any given neighborhood
of the point (x) = (0). This completes the proof of the theorem.
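
The way a zero of order μ splits into μ simple zeros under a small perturbation can be seen in the simplest model case. The following minimal sketch assumes the hypothetical distinguished polynomial Q(x_1, u) = x_1^μ - u in a single parameter u (a toy stand-in, not the polynomial arising in the proof), whose discriminant vanishes only for u = 0; the function name is ours.

```python
import cmath


def simple_zeros(mu: int, u: complex) -> list:
    """Return the mu zeros of the toy polynomial Q(x, u) = x**mu - u
    for u != 0. They are the mu-th roots of u, all simple and all of
    absolute value |u|**(1/mu)."""
    if u == 0:
        raise ValueError("u = 0 is the discriminant locus: one zero of order mu")
    radius = abs(u) ** (1.0 / mu)
    angle = cmath.phase(u)
    return [cmath.rect(radius, (angle + 2 * cmath.pi * k) / mu)
            for k in range(mu)]


if __name__ == "__main__":
    for u in (1e-2, 1e-6, 1e-10):
        zeros = simple_zeros(5, u)
        print(u, max(abs(z) for z in zeros))
```

As u tends to 0 the common absolute value |u|^(1/μ) of the zeros tends to 0, so all μ simple zeros eventually lie in any prescribed neighborhood of x_1 = 0, in the same way as the zeros of Q(x_1, (e)) in the argument above.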

Combining Theorem III and Theorem IV, we obtain immediately the
main theorem:


THEOREM V. A compact analytic variety in S_n is an algebraic variety.


Proof. Let E_r be an analytic simplex contained in the compact analytic
variety W_r. According to Theorem III and Theorem IV, there is an algebraic
variety V_r which also contains the analytic simplex E_r. By the principle of
analytic continuation, any two (irreducible) compact analytic varieties of r
dimensions which have an analytic simplex E_r in common must coincide.
Hence we have W_r = V_r, and the theorem is proved.

It is perhaps not without interest to point out here the analogy between
Theorem IV and the well-known criterion for an algebraic number by means
of diophantine approximation: A number ξ is algebraic if and only if there
exists a positive integer N such that no form Σ_{i=0}^{n} z_i ξ^i has a proper
approximation with respect to the variables z_0, z_1, ..., z_n. In other words, if
Z = max(|z_0|, ..., |z_n|), then the number ξ is algebraic if and
only if for each n the diophantine inequality

|Σ_{i=0}^{n} z_i ξ^i| < Z^{-N}

has only a finite number of solutions for which the left hand side is not zero.
Hence, for each n there exists a positive number Γ_n such that the diophantine
inequality

|Σ_{i=0}^{n} z_i ξ^i| < (Γ_n Z)^{-N}

has no solution for which the left hand side is not zero. Taking the negative
of the logarithm of both sides, we can write the inequality as follows:


polynomial F(u_1, ..., u_{r+1}) cannot have multiple factors in the ring of power series
k{u_1, ..., u_{r+1}}. This is a special case of a theorem proved by Chevalley, see [2],
p. 11, Theorem 1, which asserts that any prime ideal in k[u_1, ..., u_{r+1}] is the
intersection of prime ideals in k{u_1, ..., u_{r+1}}.




-log |Σ_{i=0}^{n} z_i ξ^i| > N(log Z + log Γ_n).


In the analogy between algebraic number field and algebraic function field
of one variable we can regard the number ξ as an "analytic branch" and the
equation Σ_{i=0}^{n} z_i y^i = 0, for a given (z), as an "algebraic curve." The number
log Z + log Γ_n can then be regarded as representing the "degree" of the
"algebraic curve" and the left hand side of the above inequality as the
"intersection multiplicity" of the "analytic branch" with the "algebraic
curve." Carrying this over to the algebraic function field of one variable,
we obtain the statement that an analytic branch E_1 in S_2 is algebraic if and
only if there is a number N such that no algebraic curve of degree m can
have an isolated intersection with E_1 with a multiplicity greater than mN.
Thus our Theorem IV can be considered as the extension of this criterion
to analytic elements in space of higher dimensions.
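
The diophantine side of this analogy can be made concrete in one elementary case. The sketch below is only an illustration under our own assumptions: it takes ξ = √2, n = 1, N = 1 and the constant 1 + √2 in the role of Γ_1, and searches a finite box of integer pairs for a counterexample to |z_0 + z_1 ξ| ≥ ((1 + √2) Z)^{-1}. The bound itself follows from |z_0 + z_1 √2| · |z_0 - z_1 √2| = |z_0^2 - 2 z_1^2| ≥ 1; all names in the code are ours.

```python
import math

SQRT2 = math.sqrt(2.0)
GAMMA = 1.0 + SQRT2  # assumed constant playing the role of Gamma_1


def violates_bound(z0: int, z1: int, tol: float = 1e-9) -> bool:
    """True if 0 < |z0 + z1*sqrt(2)| < (GAMMA * Z)**(-1), where
    Z = max(|z0|, |z1|); tol absorbs floating-point rounding."""
    Z = max(abs(z0), abs(z1))
    if Z == 0:
        return False  # the zero form is excluded
    value = abs(z0 + z1 * SQRT2)
    return value * GAMMA * Z < 1.0 - tol


def count_violations(limit: int) -> int:
    """Count integer pairs in the box |z0|, |z1| <= limit that violate
    the bound."""
    return sum(
        violates_bound(z0, z1)
        for z0 in range(-limit, limit + 1)
        for z1 in range(-limit, limit + 1)
    )


if __name__ == "__main__":
    print(count_violations(200))
```

A finite search of this kind cannot prove the inequality, and it says nothing about forms of higher degree n; it only shows the shape of the statement for one algebraic number.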

Finally, we should like to remark that though Theorem IV is stated 
and proved only for a regular analytic element, it is not difficult to see that 
it holds in fact for any analytic element, whether regular or singular. Thus 
this theorem expresses a property which is both necessary and sufficient for 
any analytic element to be algebraic. Though our proof is partly algebraic 
and partly topological, as is necessary for our present purpose, the theorem 
itself can be expressed in purely algebraic terms, using only formal power 
series. In fact, one could probably obtain a purely algebraic proof of this 
result by means of the intersection theory for algebroid varieties as developed 
by Chevalley.⁸


5. Meromorphic transformations. We begin with an almost obvious
generalization of Theorem V to a multiply projective space. For the sake
of convenience we shall restrict ourselves to the case of a doubly projective
space S_m × S_n, though the results hold obviously for the general case. We
can define an analytic variety W_r in S_m × S_n in exactly the same way as we
have done in section 2 for S_n. Now the space S_m × S_n can be mapped by a bi-regular
birational transformation onto a non-singular algebraic variety V_{m+n} in a
suitably chosen projective space S_t. Under this transformation any analytic
or algebraic variety in S_m × S_n is carried into an analytic or algebraic variety
in V_{m+n} respectively, and conversely. Therefore, if W_r is a compact analytic
variety in S_m × S_n, then its image W_r* is also a compact analytic variety in


8 Chevalley [2].




V_{m+n} and hence also in S_t. By Theorem V, W_r* is an algebraic variety, hence
W_r is also an algebraic variety.


THEOREM VI. A compact analytic variety in a multiply projective space
is an algebraic variety.


In fact, we can extend this theorem to all the so-called "extended spaces"
of Osgood,⁹ but we shall not go into details here. Instead we shall derive
from Theorem VI an important result concerning meromorphic transformations
of an analytic variety. A mapping of an analytic variety W_r in S_n into the
space S_m is called a meromorphic transformation if in a sufficiently small
neighborhood R of any point (a) of W_r and for a suitable choice of the affine
spaces A_n(x) and A_m(y) in S_n and S_m respectively, the mapping is defined
by a set of equations


(9) g(x) y_i = f_i(x), i = 1, ..., m,


where the functions g(x), f_1(x), ..., f_m(x) are holomorphic in R and g(x)
does not vanish identically on W_r ∩ R. These equations are to be
understood with the following stipulations: For a point (x) = (p) in W_r ∩ R
such that g(p) ≠ 0, the image under the mapping is the point y_i = f_i(p)/g(p),
i = 1, ..., m, in S_m. We shall call such a point (p) a regular point of the
mapping. For a point (x) = (p) such that g(p) = 0, the image under the
mapping is to be the set of all points in S_m which are limits of the images
of a sequence of regular points approaching (p). Thus a meromorphic
transformation, just like the rational transformation in case of an algebraic
variety, is not a mapping in the strict sense of the word; it assigns only to a
sufficiently "general" point of W_r a unique image, while to the others (which
correspond to the fundamental points of a rational transformation) it assigns
certain subsets as images.


THEOREM VII. A meromorphic transformation of a compact analytic
(hence algebraic) variety W_r in S_n into a projective space S_m is a rational
transformation of W_r, and the image is an algebraic variety in S_m.


Proof. The graph of the meromorphic transformation in the product
space S_n × S_m is a point set G, the projection of which in S_n is the variety
W_r. Let ((a), (b)) be any point of G, and let R be a sufficiently small
neighborhood of ((a), (b)) with R' and R'' as its projections in S_n and S_m
respectively. The variety W_r being algebraic, it is defined in the neighborhood


9 See Osgood [7], Ch. III, § 32.




R' of the point (a) by a set of algebraic equations φ_j(x) = 0, j = 1, ..., s.
Then the set G ∩ R will be defined by the set of equations


(10) g(x) y_i = f_i(x), i = 1, ..., m; φ_j(x) = 0, j = 1, ..., s,


with the stipulation that any component variety lying entirely in the
hypersurface g(x) = 0, if it is an isolated one, should be deleted from
the set. It is well known¹⁰ from the theory of power series ideals that
this can be achieved by the adjunction of a finite number of suitably chosen
analytic equations to the equations (10), if the neighborhood R is chosen
sufficiently small. Thus the set G ∩ R consists of all the common solutions
of a set of analytic equations in R. Since ((a), (b)) is any point of G, this
means that G is an analytic variety in S_n × S_m.

It remains to show that G is a compact set. Let ((a^{(i)}), (b^{(i)})), i = 1,
2, ..., be a sequence of points in G converging to a point ((a), (b)) in
S_n × S_m; we have to show that ((a), (b)) is a point of G. Since W_r is
compact and the point (a) is the limit point of the sequence (a^{(i)}), i = 1,
2, ..., it follows that (a) is a point of W_r. In a sufficiently small
neighborhood of (a) the given meromorphic transformation will be represented by
a set of equations (9), with the corresponding stipulations. We can assume
without loss of generality that for all large i the points (a^{(i)}) are regular,
i.e. g(a^{(i)}) ≠ 0. For, since each point ((a^{(i)}), (b^{(i)})) with g(a^{(i)}) = 0 is
by our stipulation the limit of a sequence of regular points, we can always
replace each point ((a^{(i)}), (b^{(i)})), for sufficiently large i, by a regular point
sufficiently near to it so that the resulting sequence will have the same limit
point ((a), (b)). Now, if g(a) ≠ 0, then we have

b_j = lim b_j^{(i)} = lim f_j(a^{(i)})/g(a^{(i)}) = f_j(a)/g(a), j = 1, ..., m;

and hence ((a), (b)) is a point of G. On the other hand, if g(a) = 0, then
the point ((a), (b)) is by our stipulation a point of G. Therefore G is a compact
point set.

Thus we have shown that G is a compact analytic variety in S_n × S_m;
hence, by Theorem VI, G is an algebraic variety. It follows then that the
projection of G in S_m is also an algebraic variety W* and the given
meromorphic transformation is really an (irreducible) algebraic correspondence
between W_r and W*. Since this algebraic correspondence assigns to a generic
point of W_r a unique image point in W* (and since our ground field has


10 See Rückert [8], or Bochner and Martin [1], Ch. X.




the characteristic 0), therefore it is a rational transformation of W_r onto W*.
Thus the proof is completed.

The concept of a meromorphic transformation evidently includes that
of a meromorphic function as the special case m = 1. Hence we have the
corollary:


COROLLARY. An everywhere meromorphic function on a compact analytic
variety is a rational function.


So far as we know, even this corollary has been proved up to now only 
for a few special cases such as the projective space and the space of analysis 
(the product of projective lines). 


THE JOHNS HOPKINS UNIVERSITY


REFERENCES. 


1. S. Bochner and W. T. Martin, Several Complex Variables, Princeton, 1948.

2. C. Chevalley, "Intersections of algebraic and algebroid varieties," Transactions of
the American Mathematical Society, vol. 57 (1945), pp. 1-85.

3. K. Knopp and R. Schmidt, "Funktionaldeterminanten und Abhaengigkeit von
Funktionen," Mathematische Zeitschrift, vol. 25 (1926), pp. 373-381.

4. B. O. Koopman and A. B. Brown, "On the covering of analytic loci by complexes,"
Transactions of the American Mathematical Society, vol. 34 (1932), pp.
231-251.

5. S. Lefschetz, Topology, New York, 1930.

6. S. Lefschetz and J. H. C. Whitehead, "On analytical complexes," Transactions of the
American Mathematical Society, vol. 35 (1933), pp. 510-517.

7. W. F. Osgood, Lehrbuch der Funktionentheorie, Bd. II, 1. Leipzig, 1929.

8. W. Rückert, "Zum Eliminationsproblem der Potenzreihenideale," Mathematische
Annalen, vol. 107 (1932), pp. 259-281.

9. B. L. van der Waerden, "Topologische Begründung des Kalküls der abzählenden
Geometrie," Mathematische Annalen, vol. 102 (1929), pp. 337-362.




A CHARACTERIZATION OF THE SPECTRA OF ONE-DIMENSIONAL
WAVE EQUATIONS.*


By PHILIP HARTMAN.


1. Let a positive p(t) and a real-valued q(t), where 0 ≤ t < ∞, be
continuous functions with the property that the differential equation


(1) (pa’)’ + 


is in the Grenzpunktfall. This means ([4], p. 238) that, for some (and then
for every) λ, (1) possesses a solution which is not of class L²(0, ∞). Thus
(1) and a boundary condition 


(2) x(0) cos α + x'(0) sin α = 0 (0 ≤ α < π)


determine a boundary value problem in the Hilbert space L²(0, ∞). Let S(α)
denote the spectrum of this eigenvalue problem (1), (2).

Let x(t, λ) ≢ 0 be a (real-valued) solution of (1), (2) and let N(T, λ)
denote the number of its zeros on 0 ≤ t < T. Finally, for λ' < λ'', let
n = n(λ', λ'') ≤ ∞ denote the limit inferior, as T → ∞, of the difference


(3) N(T, λ'') - N(T, λ').
The following theorem will be proved:


(1) There are exactly n points » of S(a) satisfying A” or 
VSA<AX” according as (1) is osciliatory or non-oscillatory for X="; in 
particular, n= co if and only if an infinite set of points r of S(a) satisfies 


What is essential and new in (I) is the fact that no restriction is placed
on (1), except that it be in the Grenzpunktfall. In particular, it is not
assumed that (1) is non-oscillatory for λ on λ' ≤ λ ≤ λ'' as in earlier theorems,
or even that (1) is non-oscillatory for some λ. Actually, the oscillatory case
of (1) has always been the more complicated in view of the comparatively
simple structure of the solutions of (1) in the non-oscillatory case; cf., e.g.,
[1], pp. 701-703. For this reason, the proofs for the non-oscillatory cases
cannot be adapted to (I). Oscillation theorems of the type (I) go back to


* Received May 5, 1949. 




916 PHILIP HARTMAN. 


Weyl [4], pp. 251-257, where it is assumed that (1) is non-oscillatory for
and that q(t) = const. The author has shown [1] that Weyl’s
method can be modified so as to apply to the general non-oscillatory case without
any assumption on $q(t)$, except that (1) be in the Grenzpunktfall.

The proof of (I) is similar to those recently used in [3] in connection
with separation theorems for the spectra of Hermitian matrices and their sections;
in particular, see the last part of (iii) in the Appendix and the proof
of theorem (*) in [3]. The assertion (I) and its proof will show and will
depend on the fact that the spectrum $S(\alpha)$ can be obtained as the “limit”
of the spectra $S_T(\alpha)$ of Sturm-Liouville boundary value problems on $0 \le t \le T$,
as $T \to \infty$. This answers a question raised by Wintner regarding the definition
and validity of the limit process $S_T(\alpha) \to S(\alpha)$ as $T \to \infty$.

Let $\mu_0(T) < \mu_1(T) < \cdots$ denote the set $S_T(\alpha)$ of eigenvalues of the
Sturm-Liouville boundary value problem on $0 \le t \le T$ determined by (1),
the boundary condition (2) at $t = 0$, and


(4) $x = 0$ at $t = T$.


(II) A number $\lambda$ belongs to $S(\alpha)$ if and only if $d(T) \to 0$ as $T \to \infty$,
where $d(T) = \min |\lambda - \mu_j(T)|$ for $j = 0, 1, 2, \dots$.


The point in (II) is that in order that $\lambda$ be in $S(\alpha)$, it is not sufficient
that $d(T) \to 0$ be true merely on a sequence of $T$-values which tend to $\infty$.
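The quantity $d(T)$ in (II) is easy to evaluate when the eigenvalues $\mu_j(T)$ are known in closed form. The sketch below assumes $p = 1$, $q = 0$, $\alpha = 0$ (an illustrative choice, not one made in the paper), for which $\mu_j(T) = ((j+1)\pi/T)^2$ and $S(\alpha)$ is the half-line $\lambda \ge 0$; the function name `d` is chosen here only to mirror the notation.

```python
import math

def d(lam, T):
    """d(T) = min_j |lam - mu_j(T)| for mu_j(T) = ((j + 1) * pi / T)**2."""
    # Only eigenvalues up to just beyond lam can be nearest to lam.
    k_max = int(T * math.sqrt(max(lam, 0.0)) / math.pi) + 2
    return min(abs(lam - (k * math.pi / T) ** 2) for k in range(1, k_max + 1))

for T in (10.0, 100.0, 1000.0):
    # lam = 1 lies in the spectrum, lam = -1 does not.
    print(T, d(1.0, T), d(-1.0, T))
```

For $\lambda = 1$ the values shrink as $T$ grows, while for $\lambda = -1$ they stay above 1, as (II) predicts for a point inside and a point outside the spectrum.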


The statement of (II) gives an “elementary” characterization of points of
the spectrum $S(\alpha)$ of (1), (2).

Let $\epsilon > 0$ and $T = T_\epsilon$ be so large that, for some value of $\lambda$, $d(T) = d(T, \lambda)$
satisfies $|d(T)| \le \epsilon$ whenever $T > T_\epsilon$. Then there is at least one eigenvalue
of $S_T(\alpha)$, say $\mu_h(T)$, in $[\lambda - \epsilon, \lambda + \epsilon]$ for $T > T_\epsilon$. Actually, for some
arbitrarily large values of $T$, there are at least two points $\mu_h(T)$, $\mu_{h+1}(T)$ in
$[\lambda - \epsilon, \lambda + \epsilon]$ in the oscillatory case. For consider a $T$-value $T = T' > T_\epsilon$
for which $x(T', \lambda - \epsilon) = 0$, so that $\lambda - \epsilon = \mu_h(T')$ for some $h$. Then
$\mu_{h+1}(T') \le \lambda + \epsilon$, for otherwise for some values of $T$ near but greater than $T'$,
it follows that $d(T) > \epsilon$, since $\mu_h(T) < \lambda - \epsilon$ and $\mu_{h+1}(T) > \lambda + \epsilon$. This
fact (involving $\mu_h$, $\mu_{h+1}$) shows that assertion (II), above, is an analogue of
the theorem (iii) in the Appendix of [3].

The boundary condition (4) in (II) can be replaced by any fixed homogeneous
boundary condition


(4 bis) $x(T)\cos\beta + x'(T)\sin\beta = 0$, $(0 \le \beta < \pi)$,




at $t = T$. In fact, (I) can be modified as follows: For $\lambda' < \lambda''$, let $m$ denote
the limit inferior, as $T \to \infty$, of the number of eigenvalues $\mu_0(T), \mu_1(T), \dots$
belonging to the boundary value problem (1), (2), (4 bis) on the interval
$\lambda' \le \lambda \le \lambda''$; then (I) remains true if “$m$” is read in place of “$n$.”

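It may be remarked that the difference between the oscillatory and non-oscillatory
cases of (I) is partially explained by the fact that in the latter
case $n = n(\lambda', \lambda'')$ is not only a “limit inferior,” but actually a “limit.”
Thus if (1) is non-oscillatory for $\lambda = \lambda''$, then $n(\lambda', \lambda) + n(\lambda, \lambda'') = n(\lambda', \lambda'')$
holds for $\lambda' < \lambda < \lambda''$, as a consequence of
$\{N(T, \lambda'') - N(T, \lambda)\} + \{N(T, \lambda) - N(T, \lambda')\} = N(T, \lambda'') - N(T, \lambda')$;
in particular, if $\lambda = \lambda''$ is an isolated
point of $S(\alpha)$, then $n(\lambda'' - \epsilon, \lambda'') = 0$ and $n(\lambda'', \lambda'' + \epsilon) = 1$, while
$n(\lambda'' - \epsilon, \lambda'' + \epsilon) = 1$ for small $\epsilon > 0$. On the other hand, if (1) is oscillatory
for $\lambda = \lambda''$ and $\lambda = \lambda''$ is an isolated point of $S(\alpha)$, then (I) implies
$n(\lambda'' - \epsilon, \lambda'') = 0$ and $n(\lambda'', \lambda'' + \epsilon) = 0$, but $n(\lambda'' - \epsilon, \lambda'' + \epsilon) = 1$ for
small $\epsilon > 0$, so that the interval function $n = n(\lambda', \lambda'')$ is not additive.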


(I) implies the following analogue of a theorem, known for the case when
(1) is non-oscillatory for $\lambda = \lambda''$; cf. [1].


(III) If $n < \infty$, put $N(\lambda) = n(\lambda', \lambda)$ for $\lambda' < \lambda < \lambda''$. Then (i) $N(\lambda)$
is a non-decreasing (integral-valued) step function; (ii) $N(\lambda)$ is continuous
from the left; (iii) every jump (if any) of $N(\lambda)$ has the value 1; finally,
(iv) the discontinuity points of $N(\lambda)$ on $\lambda' < \lambda < \lambda''$ are in the point spectrum
and no other $\lambda$-values satisfying $\lambda' < \lambda < \lambda''$ are in the spectrum $S(\alpha)$
of (1), (2).


2. Proof of (I). Associated with (1), (2) is a self-adjoint operator
$L(z) = (pz')' + qz$ defined on the set of functions $z(t)$ for which $z$, $pz'$ are
of class $C'$ on $0 \le t < \infty$; and $z$, $L(z)$ are of class $L^2(0, \infty)$; finally, $z$
satisfies (2) at $t = 0$. The set $S(\alpha)$ of $\lambda$-values is the spectrum of this
self-adjoint operator.

In order to avoid the consideration of different cases and duplication of
known theorems, the proof of (I) will be given in the case that (1) is oscillatory
for $\lambda = \lambda'$. For the case that (1) is non-oscillatory for $\lambda = \lambda''$, see [1];
for the case that (1) is non-oscillatory for $\lambda = \lambda'$ but oscillatory for $\lambda = \lambda''$,
see [2]. (Actually, the arguments below with slight modifications give
simpler proofs for these cases also.)

It will first be shown that $S(\alpha)$ contains at least $n$ points of $\lambda' < \lambda < \lambda''$.
If $n < \infty$, it is possible to choose $T$ so large that (3) has the value $n$ and that
$x = x(t, \lambda')$ satisfies (4), since (3) is not increasing at $T$ when (4) holds
for $x = x(t, \lambda')$. If $n = \infty$, a number $T$ can be chosen so large that the value
of (3), which will be called $n$ for the moment, exceeds any given number and
that $x = x(t, \lambda')$ satisfies (4). It may also be supposed that $x = x(t, \lambda'')$
satisfies (4), otherwise $\lambda''$ can be replaced by a smaller value of $\lambda$, say $\lambda = \lambda'''$,
for which (4) holds for $x = x(t, \lambda''')$ and (3) has the value $n$.


Let $\mu_0(T) < \mu_1(T) < \cdots$ denote the eigenvalues of the Sturm-Liouville
boundary value problem on $0 \le t \le T$ determined by (1), (2), (4), and let
$x_j(t)$, where $j = 0, 1, \dots$, denote a corresponding set of eigenfunctions
on $0 \le t \le T$. Since $x = x(t, \lambda')$ and $x = x(t, \lambda'')$ satisfy (2), (4), and
since (3) has the value $n$, it follows that $\lambda' = \mu_h$ and $\lambda'' = \mu_{h+n}$ for some $h$.
Put


(5) $z(t) = \sum_{j=0}^{n} c_j x_{h+j}(t)$ or $z(t) = 0$


according as $0 \le t \le T$ or $T < t < \infty$, where $c_0, c_1, \dots, c_n$ are arbitrary
constants. An appropriate choice of these constants will be made below.
Suppose, if possible, that only $k$ points of $\lambda' < \lambda < \lambda''$ are contained in
$S(\alpha)$, and that $k < n$. Let $\lambda_1 < \lambda_2 < \cdots < \lambda_k$ denote these $k$ values (if
$k > 0$). These $\lambda$-values are in the point spectrum of $S(\alpha)$. The equations


(6) $\int_0^\infty x(t, \lambda_j) z(t)\,dt = \int_0^T x(t, \lambda_j) z(t)\,dt = 0$

for $j = 1, 2, \dots, k$, and the equation

(7) $z'(T - 0) = 0$

represent at most $k + 1$ homogeneous linear equations for the $n + 1$ constants
$c_0, c_1, \dots, c_n$. Since $k + 1 < n + 1$, these equations possess a non-trivial
solution. In (5), let $c_0, c_1, \dots, c_n$ denote such a solution of (6), (7), so
that $z(t) \not\equiv 0$.

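Since $x = x_j(t)$ satisfies (4), it follows from (1), for $\lambda = \mu_j$, that
$(px_j')' = 0$ at $t = T$. Hence (5) and (7) show that $z = z' = (pz')' = 0$ at
$t = T$ and that $z$ and $pz'$ are of class $C'$ for $0 \le t < \infty$. Also, the second of
the conditions (5) shows that $z$, $L(z)$ are of class $L^2(0, \infty)$. Finally, $z$
satisfies (2) at $t = 0$, since $x = x_j$ does. Hence, $z = z(t)$ is in the domain of
definition of the self-adjoint operator $L(z)$. From (1) and (5), it is seen that

$L(z) + \tfrac12(\lambda' + \lambda'')z = \sum_{j=0}^{n} c_j \{\tfrac12(\lambda' + \lambda'') - \mu_{h+j}\} x_{h+j}$

for $0 \le t \le T$ and that $L(z) + \tfrac12(\lambda' + \lambda'')z = 0$ for $T < t < \infty$. Since
$|\mu_{h+j} - \tfrac12(\lambda' + \lambda'')| \le \tfrac12(\lambda'' - \lambda')$ for $j = 0, 1, \dots, n$, it follows from (5)
that

(8) $\int_0^\infty |L(z) + \tfrac12(\lambda' + \lambda'')z|^2\,dt \le \tfrac14(\lambda'' - \lambda')^2 \int_0^\infty z^2\,dt$,

in view of the orthogonality of the eigenfunctions $x_h, x_{h+1}, \dots, x_{h+n}$ on $0 \le t \le T$.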

However, since (6) means that $z(t)$ is orthogonal to the finite set of
eigenfunctions belonging to eigenvalues of $L(z)$ within a distance $\tfrac12(\lambda'' - \lambda')$
of the point $\tfrac12(\lambda' + \lambda'')$, that is, from within $\lambda' < \lambda < \lambda''$, it follows from the
Parseval identity belonging to the spectral resolution of $L(z)$ that the sign
of inequality cannot hold in (8). For the same reason, the sign of equality
can hold if and only if at least one of the numbers $\lambda'$, $\lambda''$ is an eigenvalue and
$z(t)$ is a linear combination of the corresponding functions $x(t, \lambda')$, $x(t, \lambda'')$.
But this is impossible by the last part of (5). Thus the assumption that
$\lambda' < \lambda < \lambda''$ contains only $k < n$ points of $S(\alpha)$ leads to a contradiction. This
proves that at least $n$ points of $S(\alpha)$ are in $\lambda' < \lambda < \lambda''$.

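Let $n < \infty$. It will be shown that at most $n$ points of $S(\alpha)$ are in
$\lambda' < \lambda < \lambda''$. It is clear that if $T$ is large, (3) has the value $n$ when (4)
holds for $x = x(t, \lambda')$. Consequently, there exists an $h = h(T)$ such that
$\mu_h(T) = \lambda' < \mu_{h+1}(T) < \cdots < \mu_{h+n}(T) \le \lambda'' < \mu_{h+n+1}(T)$. Hence, if (4)
holds for $x = x(t, \lambda')$, the interval $\lambda' < \lambda \le \lambda''$ contains exactly $n$ eigenvalues
$\mu_{h+1}(T), \dots, \mu_{h+n}(T)$ of the Sturm-Liouville boundary value problem (1), (2), (4).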

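Choose a sequence of $T$-values, $T_1, T_2, \dots$, which tend to $\infty$ in such a
way that, as $i \to \infty$, $\lim \mu_{h+j}(T_i) = \mu^j$ exist for $j = 1, 2, \dots, n$. It will be
shown that the points of $S(\alpha)$ in $\lambda' < \lambda < \lambda''$ are contained in the set
of (at most $n$) points $\mu^1, \dots, \mu^n$. Let $\lambda$ be a point in $\lambda' < \lambda < \lambda''$ distinct
from $\mu^1, \dots, \mu^n$. Let $d > 0$ be so small that


$\min(\lambda'' - \lambda,\ \lambda - \lambda',\ |\lambda - \mu_{h+j}(T)|) > d > 0$


holds for all large $T = T_i$ and for $j = 1, \dots, n$. Finally, let $g = g(t)$, where
$0 \le t < \infty$, be an arbitrary continuous function of class $L^2(0, \infty)$.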
The last set of inequalities implies that if $T = T_i$ is sufficiently large, then


(9) $(px')' + (q + \lambda)x = g$


has a solution $x = x_T(t)$ satisfying (2), (4); furthermore,


$\int_0^T x_T^2\,dt \le d^{-2} \int_0^\infty g^2\,dt$,


since the distance from $\lambda$ to the nearest of the eigenvalues $\mu_0(T), \mu_1(T), \dots$
exceeds $d$. Clearly, the sequence $T_1, T_2, \dots$ contains a subsequence $R_1, R_2, \dots$
such that, as $j \to \infty$, $\lim x_R(t) = x(t)$, where $R = R_j$, exists (uniformly on
every finite interval $0 \le t \le T$). Hence $x(t)$ is a solution of (9) satisfying
(2) and

$\int_0^\infty x^2\,dt \le d^{-2} \int_0^\infty g^2\,dt$.

Since $g$ is an arbitrary continuous function of class $L^2(0, \infty)$, it follows that
$\lambda$ does not belong to $S(\alpha)$; [4], p. 251. This completes the proof of (I).


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] P. Hartman, “Differential equations with non-oscillatory eigenfunctions,” Duke
Mathematical Journal, vol. 15 (1948), pp. 697-709.

[2] P. Hartman and C. R. Putnam, “The least cluster point of the spectrum of boundary
value problems,” American Journal of Mathematics, vol. 70 (1948), pp. 849-855.

[3] P. Hartman and A. Wintner, “Separation theorems for bounded Hermitian forms,”
ibid., vol. 71 (1949), pp. 865-878.

[4] H. Weyl, “Ueber gewöhnliche Differentialgleichungen mit Singularitäten und die
zugehörigen Entwicklungen willkürlicher Funktionen,” Mathematische Annalen,
vol. 68 (1910), pp. 220-269.




THE EIGENVALUE PROBLEM FOR ORDINARY DIFFERENTIAL 
EQUATIONS OF THE SECOND ORDER AND HEISENBERG’S 
THEORY OF S-MATRICES.* 


By KUNIHIKO KODAIRA.


Recently E. C. Titchmarsh has treated the theory of expansion of
arbitrary functions in terms of the eigenfunctions of a differential operator
of the second order by a new method and obtained results of importance for
applications.¹ The method of Titchmarsh is based solely on the calculus of
residues. In the present paper we shall first give another proof of Titchmarsh’s
results based on the general theory of linear operators in Hilbert
space and, secondly, applying them to differential equations of Schrödinger
type, show that a theorem of Heisenberg² concerning the S-matrix can be
founded on these results.


1. Spectral theorem. Let us consider the differential expression:

$L[u] = -\dfrac{d}{dx}\Bigl(p(x)\dfrac{du}{dx}\Bigr) + q(x)\,u$,

defined in a (finite or infinite) open interval $(a,b)$, where $p(x)$, $q(x)$ are
real-valued functions defined in $(a,b)$, $p(x)$ has continuous first derivative,
$q(x)$ is continuous, and $p(x) > 0$ for $a < x < b$; for $x \to a$ or $x \to b$, $p(x)$,
$q(x)$ may behave arbitrarily, e.g. increase infinitely, oscillate infinitely
many times, etc.


Classification. Consider the differential equation

(1.1) $L[u] = l \cdot u$,


where $l$ means a complex parameter.


* Received May 12, 1949. 

¹ Titchmarsh [13]. The results of the present paper were obtained by the author
in August, 1948 independently of Titchmarsh’s work and other recent literature which
was inaccessible in Japan. The paper was revised following the suggestion of Professor
H. Weyl, who kindly informed the author of the literature. The author wishes to
express here his best thanks to Professor Weyl.

² Heisenberg [7]. See also Ma [9], Jost [8].




KUNIHIKO KODAIRA. 


THEOREM 1.1. Choose a fixed point $c$, $a < c < b$, arbitrarily. If every
solution $u$ of $L[u] = l_0 \cdot u$ is square summable in $(a,c]$ (or $[c,b)$) for some
$l_0$, then, for arbitrary $l$, every solution $u$ of $L[u] = l \cdot u$ is also square
summable in $(a,c]$ (or $[c,b)$).³


Since the solution $u$ is square summable in $(a,c]$ (or $[c,b)$) or not
independently of the choice of $c$, we can classify, by virtue of Theorem 1.1,
the differential expressions $L$ in two types with respect to $a$ (or $b$): if every
solution $u$ of $L[u] = l \cdot u$ is square summable in $(a,c]$ (or $[c,b)$), $L$ is said
to be of the l.c. type (limit circle type) at $a$ (or $b$); otherwise $L$ is said to
be of the l.p. type (limit point type) at $a$ (or $b$). Thus there exist the
following four cases:


I. $L$ is of the l.p. type at both $a$ and $b$.

II. $L$ is of the l.c. type at $a$ and of the l.p. type at $b$.
II'. $L$ is of the l.p. type at $a$ and of the l.c. type at $b$.
III. $L$ is of the l.c. type at both $a$ and $b$.⁴


In what follows the case II' will be omitted, since this case can be reduced
to the case II by the transformation: $x \to -x$.
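As an illustration of the classification, the simplest expression can be worked out by hand. The following is a sketch under the assumption $p = 1$, $q = 0$ (a choice made here for illustration only), with $l = k^2$ and $\Im k > 0$:

```latex
% Assumed example: L[u] = -u'' (p = 1, q = 0), l = k^2 with \Im k > 0.
% Every solution of L[u] = l u is a combination of e^{ikx} and e^{-ikx}.
\begin{align*}
\int_c^{\infty} \bigl|e^{ikx}\bigr|^2\,dx
  &= \int_c^{\infty} e^{-2(\Im k)x}\,dx < \infty, &
\int_c^{\infty} \bigl|e^{-ikx}\bigr|^2\,dx
  &= \int_c^{\infty} e^{2(\Im k)x}\,dx = \infty .
\end{align*}
% Not every solution is square summable near +\infty: l.p. type at an end point +\infty
% (and, by the reflection x -> -x, at an end point -\infty).
% At a finite end point both solutions are bounded, hence square summable
% on a finite interval: l.c. type there.
%   (-\infty, +\infty): case I,   (0, +\infty): case II,
%   (-\infty, 0): case II',       (0, 1): case III.
```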


Bracket. For arbitrary functions $u$, $v$, we introduce the bracket:


$[uv](x) = p(x)\,[u(x)v'(x) - u'(x)v(x)]$,
$(u' = du/dx,\ v' = dv/dx)$.


In case $u$ and $v$ satisfy one and the same equation $L[u] = l \cdot u$, we write
$[uv]$ for $[uv](x)$, since, in this case, $[uv](x)$ does not depend on $x$.


Fundamental solutions. By a system of fundamental solutions we shall
mean the system of two solutions $s_1(x,l)$, $s_2(x,l)$ of the equation $L[u] = l \cdot u$
having the following properties:


(1.2)
i) $[s_2 s_1] = 1$,
ii) $s_k(x, \bar{l}) = \overline{s_k(x, l)}$, $(k = 1, 2)$,⁵
iii) as functions of $l$, $s_k(x,l)$ and $(d/dx)s_k(x,l)$
$(k = 1, 2)$ are regular analytic in the whole $l$-plane.


³ Weyl [14], Theorem 5, p. 238.
⁴ Weyl [14], [15].
⁵ The bar means the conjugate complex number.




Such a system of solutions $s_1$, $s_2$ is obtained, for example, by solving the
equation $L[u] = l \cdot u$ under the boundary conditions:

$s_1(c) = s_2'(c) = 0$, $s_2(c) = p(c)s_1'(c) = 1$, $(a < c < b)$,
where $c$ is an arbitrary fixed point in $(a,b)$.
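The properties (1.2) can be checked numerically in a case where the fundamental solutions are explicit. The sketch below assumes $p = 1$, $q = 0$ and $c = 0$ (an illustrative choice, not from the paper), for which $s_1 = \sin(kx)/k$ and $s_2 = \cos(kx)$ with $k = \sqrt{l}$; the helper names are hypothetical.

```python
import cmath

def s1(x, l):
    """Solution of -u'' = l u with s1(0) = 0, s1'(0) = 1 (p = 1, q = 0, c = 0)."""
    k = cmath.sqrt(l)
    return cmath.sin(k * x) / k if k != 0 else complex(x)

def s2(x, l):
    """Solution of -u'' = l u with s2(0) = 1, s2'(0) = 0."""
    return cmath.cos(cmath.sqrt(l) * x)

def ds1(x, l):
    """Derivative of s1 with respect to x."""
    return cmath.cos(cmath.sqrt(l) * x)

def ds2(x, l):
    """Derivative of s2 with respect to x."""
    k = cmath.sqrt(l)
    return -k * cmath.sin(k * x)

def bracket_s2_s1(x, l):
    """[s2 s1](x) = p(x) (s2 s1' - s2' s1) with p = 1."""
    return s2(x, l) * ds1(x, l) - ds2(x, l) * s1(x, l)

l = 2.0 + 1.0j
for x in (-3.0, 0.5, 7.0):
    # The bracket should not depend on x and should equal 1, as in (1.2) i).
    print(x, bracket_s2_s1(x, l))
```

Both $s_1$ and $s_2$ are even in $k$, so they are entire functions of $l$ regardless of the branch of the square root, which is property iii).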

The differential operator. Denote by $\mathfrak{H}$ the Hilbert space consisting of
all square summable functions defined in $(a,b)$; the inner product of the
functions $u$, $v$ in $\mathfrak{H}$ will be denoted by $(u,v)$, and the norm of $u$ by $\|u\|$.
In order to consider $L$ as a linear operator on $\mathfrak{H}$, we have to make explicit
the domain of $L$. For that purpose, we introduce the subspace $\mathfrak{D}$ of $\mathfrak{H}$
consisting of all functions $u$ having the following properties:

i) $u \in \mathfrak{H}$,

ii) $u$ is differentiable in the open interval $(a,b)$,

iii) $du/dx$ is absolutely continuous in every closed subinterval
$[a_1, b_1]$ $(a < a_1 < b_1 < b)$ of $(a,b)$,

iv) $L[u] \in \mathfrak{H}$,


and consider $L$ as a linear operator having $\mathfrak{D}$ as its domain. Then $L$ becomes
a closed operator,⁶ which will be denoted by $T$; i.e. we put


$Tu = L[u]$, for $u \in \mathfrak{D}$.


In the case I, $T$ is self-adjoint; while, in the cases II and III, $T$ is not
self-adjoint.⁷


Boundary conditions. Suppose $L$ to be of the l.c. type at $a$. Then, if
$u \in \mathfrak{D}$, the limit:


$[wu](a) = \lim_{x \to a} [wu](x)$


exists for an arbitrary function $w(x)$ defined in $(a,c]$ $(a < c < b)$ having
continuous second derivatives such that


(1.3) $\int_a^c |w|^2\,dx < +\infty$, $\int_a^c |L[w]|^2\,dx < +\infty$.

Fixing such a function $w$, we take therefore

(1.4) $[wu](a) = 0$

⁶ Stone [11], Theorem 10.11.

⁷ Weyl [14], Chap. III and Chap. II, respectively. These statements can be found
in Stone [11], Theorem 10.12 in the terminology employed here.




as the boundary condition at $a$. The condition (1.4) will be called trivial,
if every $u \in \mathfrak{D}$ satisfies (1.4). From the identity:


(1.5) $[wu](a) = [ws_1](a)\,[s_2u](a) - [ws_2](a)\,[s_1u](a)$

follows that the boundary condition (1.4) is not trivial if and only if

(1.6) $|[ws_1(l)](a)| + |[ws_2(l)](a)| \neq 0$


for fixed $l$, where $s_k(l)$ mean the fundamental solutions mentioned above.
Now, in the case I, we put


(1.7_I) $H = T$,

since $T$ is self-adjoint. In the case II, we take a real valued function $w_a(x)$
satisfying (1.3) and (1.6), and restrict the domain $\mathfrak{D}$ of $T$ by imposing
the non-trivial boundary condition $[w_au](a) = 0$. Then we get from $T$ a
self-adjoint operator,⁸ which will be denoted by $H$; i.e. we put


(1.7_II) $Hu = L[u]$, for $u \in \mathfrak{D}$, $[w_au](a) = 0$.

Finally, in the case III, we take, besides the function $w_a(x)$ mentioned above,
the real valued function $w_b(x)$ in $[c,b)$ satisfying the conditions at $b$
corresponding to (1.3) and (1.6), and restrict the domain $\mathfrak{D}$ of $T$ by
imposing the boundary conditions:


$[w_au](a) = [w_bu](b) = 0$.

Then we obtain from $T$ a self-adjoint operator,⁹ which will be denoted by $H$;
i.e. we put

(1.7_III) $Hu = L[u]$, for $u \in \mathfrak{D}$, $[w_au](a) = [w_bu](b) = 0$.


Our purpose is to determine the spectra of the self-adjoint operator $H$ thus
defined.


Characteristic functions. Let a self-adjoint operator $H$ defined as above
be given. Then we put


(1.8_b) $f_b(l) = -\lim_{x \to b} s_2(x,l)/s_1(x,l)$


if $L$ is of the l.p. type at $b$, and


⁸ Weyl [14], Chap. II. Cf. also Stone [11], Theorem 10.13.
⁹ Weyl [15], pp. 241-242. Cf. also Stone [11], Theorem 10.14.




(1.9_b) $f_b(l) = -\{[w_bs_2(l)](b)\}/\{[w_bs_1(l)](b)\}$

if $L$ is of the l.c. type at $b$; similarly we put

(1.8_a) $f_a(l) = -\lim_{x \to a} s_2(x,l)/s_1(x,l)$

if $L$ is of the l.p. type at $a$, and

(1.9_a) $f_a(l) = -\{[w_as_2(l)](a)\}/\{[w_as_1(l)](a)\}$


if $L$ is of the l.c. type at $a$. The functions $f_a(l)$ and $f_b(l)$ thus defined
will be called the characteristic functions¹⁰ of the operator $H$. In case $L$
is of the l.p. type at $b$ (or $a$), the function $f_b(l)$ (or $f_a(l)$) is uniquely
determined by the condition¹¹


(1.10_b) $\int_c^b |s_2(x,l) + f_b(l)s_1(x,l)|^2\,dx < +\infty$, $(a < c < b)$


or


(1.10_a) $\int_a^c |s_2(x,l) + f_a(l)s_1(x,l)|^2\,dx < +\infty$, $(a < c < b)$.


From (1.2), ii) follows immediately


(1.11) $f_a(\bar{l}) = \overline{f_a(l)}$, $f_b(\bar{l}) = \overline{f_b(l)}$.

Again we have

(1.12) $f_a(l) \neq f_b(l)$, (for $\Im l \neq 0$),


since $H$ is self-adjoint.


THEOREM 1.2.¹² The characteristic functions $f_a(l)$ and $f_b(l)$ are analytic
functions of $l$ which are meromorphic in $\Im l \neq 0$. Especially, if $L$ is of the
l.c. type at $a$ (or $b$), then $f_a(l)$ (or $f_b(l)$) is meromorphic also on the real
axis.
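The limit definition of the characteristic functions can be tried out on the whole line. The sketch below assumes $p = 1$, $q = 0$, $(a,b) = (-\infty, +\infty)$ and $c = 0$ (an illustrative case, not one treated in the paper), so that $s_1 = \sin(kx)/k$, $s_2 = \cos(kx)$, $k = \sqrt{l}$; the limits $x \to \pm\infty$ are approximated by $x = \pm 50$. Under these assumptions one expects $f_b(l) = i\sqrt{l}$ and $f_a(l) = -i\sqrt{l}$ for $\Im l > 0$.

```python
import cmath

def ratio(x, l):
    """-s2(x, l) / s1(x, l) for -u'' = l u with c = 0: s1 = sin(kx)/k, s2 = cos(kx)."""
    k = cmath.sqrt(l)
    return -k * cmath.cos(k * x) / cmath.sin(k * x)

l = 1.0 + 1.0j
k = cmath.sqrt(l)        # principal branch: Im k > 0 when Im l > 0
f_b = ratio(50.0, l)     # approximates the limit x -> +infinity
f_a = ratio(-50.0, l)    # approximates the limit x -> -infinity

print(f_b, 1j * k)
print(f_a, -1j * k)
```

With these values $s_2 + f_b s_1 = e^{ikx}$ and $s_2 + f_a s_1 = e^{-ikx}$, which are square summable near $+\infty$ and $-\infty$ respectively, in line with the conditions (1.10).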


Spectral theorem. Let $H$ be a self-adjoint differential operator defined
as above and


$H = \int_{-\infty}^{+\infty} \lambda\,dE(\lambda)$


¹⁰ Titchmarsh [13], Chap. III. Our theory is parallel to that of Jacobi matrices. See,
e.g., Stone [11], Chap. X, § 4. The characteristic functions correspond to the function
F(l) in Stone [11], p. 560.
¹¹ Weyl [14], p. 227. Cf. also Stone [11], Theorems 10.12, 10.14, 10.20.

¹² For the proof of this theorem, see Weyl [16].




be the spectral representation of $H$. Introduce the “characteristic matrix”
$M(l) = (M_{jk}(l))$ defined by


(1.13)
$M_{11}(l) = f_a(l)f_b(l)\,[f_a(l) - f_b(l)]^{-1}$,
$M_{12}(l) = M_{21}(l) = \tfrac12[f_a(l) + f_b(l)]\,[f_a(l) - f_b(l)]^{-1}$,
$M_{22}(l) = [f_a(l) - f_b(l)]^{-1}$,


where $f_a(l)$, $f_b(l)$ are the characteristic functions of $H$. It follows from
(1.11), (1.12) and Theorem 1.2 that $M_{jk}(l)$ is an analytic function of $l$
which is regular in $\Im l \neq 0$ and

(1.14) $M_{jk}(\bar{l}) = \overline{M_{jk}(l)}$.

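THEOREM 1.3. (SPECTRAL THEOREM).¹³ For every real number $\lambda$
there exists the limit


(1.15) $\rho_{jk}(\lambda) = \lim_{\delta \to +0} \lim_{\epsilon \to +0} \dfrac{1}{\pi} \int_{\delta}^{\lambda+\delta} \Im M_{jk}(\lambda + i\epsilon)\,d\lambda$.


As a function of $\lambda$, the matrix function $P(\lambda) = (\rho_{jk}(\lambda))$ is continuous on
the right and monotone non-decreasing in the sense that, for $\mu < \lambda$, the
symmetric matrix $P(\lambda) - P(\mu)$ is positive semi-definite. Put, for every
finite interval $\Delta = (\mu, \lambda]$,


$E(\Delta) = E(\lambda) - E(\mu)$.


Then, for every $u \in \mathfrak{H}$, $E(\Delta)u(x)$ can be represented as follows:


(1.16) $E(\Delta)u(x) = \int_a^b u(y)\,dy \int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,d\rho_{jk}(\lambda)$,


where


$\int_a^b dy\,\Bigl|\int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,d\rho_{jk}(\lambda)\Bigr|^2 < +\infty$


and the integral $\int_a^b dy$ in (1.16) converges absolutely. $u(x)$ itself is represented
as follows:

(1.17) $u(x) = \lim_{\Delta \to (-\infty,+\infty)} \int_a^b u(y)\,dy \int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,d\rho_{jk}(\lambda)$,


where the limit converges in the mean; especially if $u$ belongs to the domain
of $H$, we have


(1.18) $Hu(x) = \lim_{\Delta \to (-\infty,+\infty)} \int_a^b u(y)\,dy \int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,\lambda\,d\rho_{jk}(\lambda)$.


¹³ Cf. Titchmarsh [13], Chap. III.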




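For every $\nu > 0$, the residual terms

(1.19) $R^{\nu}_{jk}(l) = M_{jk}(l) - \int_{-\nu}^{\nu} (\lambda - l)^{-1}\,d\rho_{jk}(\lambda)$


are regular in the $l$-plane except for real $l$ such that $l \le -\nu$ or $l \ge \nu$; thus
the singularities of $M_{jk}(l)$ in the finite $l$-plane are completely determined
by $\rho_{jk}(\lambda)$. Especially every isolated singular point of $M_{jk}(l)$ is a pole of the
order 1. Incidentally, if we take as $s_1$, $s_2$ the special system of fundamental
solutions $s_1^0$, $s_2^0$ satisfying


(1.20) $s_1^0(c) = s_2^{0\prime}(c) = 0$, $s_2^0(c) = p(c)s_1^{0\prime}(c) = 1$, $(a < c < b)$,


$M_{jk}(l)$ can be represented as follows:

(1.21) $M_{jk}(l) = \int_{-\infty}^{+\infty} \{(\lambda - l)^{-1} - (\lambda - l_0)^{-1}\}\,d\rho_{jk}(\lambda) + \text{const.}$,


where $l_0$ means an arbitrary fixed point with $\Im l_0 \neq 0$ and the integral converges
absolutely.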


The existence of the matrix $\rho_{jk}(\lambda)$ so that (1.16), (1.17) hold was
first proved by H. Weyl¹⁴; another proof based on the modern theory of
Hilbert space was given by M. H. Stone.¹⁵ The formula (1.15) was
obtained by E. C. Titchmarsh.¹⁶
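The spectral formula (1.15) can be evaluated directly when the characteristic functions are explicit. The sketch below assumes the free operator $-d^2/dx^2$ on the whole line with $c = 0$, for which $f_a(l) = -i\sqrt{l}$ and $f_b(l) = i\sqrt{l}$ when $\Im l > 0$ (these values are an assumption of the example, not a statement from the paper). The function `density` returns $(1/\pi)\Im M_{jk}(\lambda + i\epsilon)$ for a small fixed $\epsilon$, an approximation to the integrand of (1.15).

```python
import cmath
import math

def M(l):
    """Characteristic matrix (1.13) for -u'' = l u on the whole line,
    using f_a = -i sqrt(l), f_b = i sqrt(l) (valid for Im l > 0, principal root)."""
    k = cmath.sqrt(l)
    fa, fb = -1j * k, 1j * k
    d = fa - fb
    return [[fa * fb / d, 0.5 * (fa + fb) / d],
            [0.5 * (fa + fb) / d, 1.0 / d]]

def density(lam, eps=1e-9):
    """(1/pi) Im M_jk(lam + i eps), the integrand of the spectral formula (1.15)."""
    m = M(complex(lam, eps))
    return [[entry.imag / math.pi for entry in row] for row in m]

print(density(4.0))    # close to [[sqrt(4)/(2 pi), 0], [0, 1/(2 pi sqrt(4))]]
print(density(-4.0))   # close to the zero matrix: no spectrum for lam < 0
```

In this example the density vanishes for $\lambda < 0$ and is positive for $\lambda > 0$, which matches the known spectrum $[0, \infty)$ of the free operator.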


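(1.15) will be called the spectral formula; it can be represented also
in the following form:


(1.22) $\rho_{jk}(\lambda) = -\lim_{\mu \to \lambda+0} \lim_{\epsilon \to +0} (2\pi i)^{-1} \int_{C(\mu,\alpha,\epsilon)} M_{jk}(l)\,dl$,


where $C(\mu, \alpha, \epsilon)$ means the contour consisting of two oriented polygonal lines
whose vertices, in order, are $\mu + i\epsilon$, $\mu + i\alpha$, $i\alpha$, $i\epsilon$, and $-i\epsilon$, $-i\alpha$, $\mu - i\alpha$,
$\mu - i\epsilon$, respectively, the real number $\alpha$ being subject to the inequality $\alpha > \epsilon$.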


Generalized Fourier Transformation. From the formal aspect, the
formula (1.17) can be rewritten in the following form:


(1.23) $u(x) = \int_{-\infty}^{+\infty} \sum_{j,k} s_j(x,\lambda)\,d\rho_{jk}(\lambda) \int_a^b s_k(y,\lambda)u(y)\,dy$,

but the integrals in (1.23) do not necessarily converge. In order to avoid


¹⁴ Weyl [14], [15].
¹⁵ Stone [11], Theorem 10.22.
¹⁶ Titchmarsh [13], Chap. III, formulae (3.1.5), (3.1.6), (3.1.7), (3.1.8).




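this difficulty, we consider $\lambda$-measurable vector functions $\phi(\lambda) = (\phi_1(\lambda), \phi_2(\lambda))$
and put


$\|\phi\|^2 = \int_{-\infty}^{+\infty} \sum_{j,k} \phi_j(\lambda)\overline{\phi_k(\lambda)}\,d\rho_{jk}(\lambda)$.


Since the matrix $dP(\lambda)$ is positive semi-definite, we have $\|\phi\|^2 \ge 0$, and
$\mathfrak{H}^* = \{\phi \mid \|\phi\| < +\infty\}$ constitutes a Hilbert space. Now, for every $u \in \mathfrak{H}$,
the integral


(1.24) $\phi_k(\lambda) = \int_a^b s_k(x,\lambda)u(x)\,dx$

converges in the sense of the norm $\|\phi\|$ in $\mathfrak{H}^*$,
and, by means of $\phi_j(\lambda)$ defined by (1.24), $u$ is represented as follows:


(1.25) $u(x) = \int_{-\infty}^{+\infty} \sum_{j,k} s_j(x,\lambda)\phi_k(\lambda)\,d\rho_{jk}(\lambda)$,


where the integral converges in the mean; thus the formula (1.23) is valid
in this sense. We have furthermore


(1.26) $\|u\| = \|\phi\|$.


Conversely, for every $\phi \in \mathfrak{H}^*$, the integral (1.25) converges in the mean, and,
by means of $u$ given by (1.25), $\phi$ can be represented in the form (1.24).
Also, if $u$ belongs to the domain of $H$, we have


(1.27) $Hu(x) = \int_{-\infty}^{+\infty} \sum_{j,k} s_j(x,\lambda)\,\lambda\,\phi_k(\lambda)\,d\rho_{jk}(\lambda)$.

Thus we obtain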


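THEOREM 1.4. (GENERALIZED FOURIER TRANSFORMATION THEOREM).
The transformation


$u(x) \to \phi_k(\lambda) = \int_a^b s_k(x,\lambda)u(x)\,dx$

is a unitary transformation mapping $\mathfrak{H}$ on $\mathfrak{H}^*$, whose inverse is given by

$\phi_k(\lambda) \to u(x) = \int_{-\infty}^{+\infty} \sum_{j,k} s_j(x,\lambda)\phi_k(\lambda)\,d\rho_{jk}(\lambda)$.


The formula (1.27) shows that the unitary transformation $u \to \phi$ transforms
the operator $H$ to the “diagonal form”: $\lambda \times$.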
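The name “generalized Fourier transformation” can be motivated by the free operator on the whole line. The following worked computation is a sketch under the assumptions $p = 1$, $q = 0$, $(a,b) = (-\infty,+\infty)$, $c = 0$, and the spectral densities $d\rho_{11} = \sqrt{\lambda}\,d\lambda/(2\pi)$, $d\rho_{22} = d\lambda/(2\pi\sqrt{\lambda})$, $\rho_{12} = 0$ for $\lambda > 0$, which are the values one obtains from (1.13) and (1.15) with $f_a = -i\sqrt{l}$, $f_b = i\sqrt{l}$; none of this is taken from the paper itself.

```latex
% Assumed example: s_1(x,\lambda) = \sin(kx)/k, s_2(x,\lambda) = \cos(kx), \lambda = k^2, k > 0.
\begin{align*}
\phi_1(\lambda) &= \frac{1}{k}\int_{-\infty}^{\infty} \sin(ky)\,u(y)\,dy, \qquad
\phi_2(\lambda) = \int_{-\infty}^{\infty} \cos(ky)\,u(y)\,dy, \\
u(x) &= \int_0^{\infty} \Bigl\{ s_1(x,\lambda)\phi_1(\lambda)\,\frac{\sqrt{\lambda}}{2\pi}
        + s_2(x,\lambda)\phi_2(\lambda)\,\frac{1}{2\pi\sqrt{\lambda}} \Bigr\}\,d\lambda \\
     &= \frac{1}{\pi}\int_0^{\infty} dk \int_{-\infty}^{\infty}
        \bigl\{\sin(kx)\sin(ky) + \cos(kx)\cos(ky)\bigr\}\,u(y)\,dy \\
     &= \frac{1}{\pi}\int_0^{\infty} dk \int_{-\infty}^{\infty} \cos k(x-y)\,u(y)\,dy .
\end{align*}
% The last line is the classical Fourier integral formula, so the inversion
% formula of the theorem reduces to ordinary Fourier inversion in this case.
```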




2. Proof of the spectral theorem.¹⁷ First we shall prove the spectral
theorem in the special case that the fundamental solutions $s_1$, $s_2$ satisfy the
conditions


(2.1) $s_1(c) = s_2'(c) = 0$, $s_2(c) = p(c)s_1'(c) = 1$, $(a < c < b)$.


For every $l$ with $\Im l \neq 0$, we introduce the “Green’s function” $G(x,y,l)$
defined by


$G(x,y,l) = G(y,x,l) = g_a(x,l)\,g_b(y,l)/\{f_a(l) - f_b(l)\}$, $(x \le y)$,


where
$g_a(x,l) = s_2(x,l) + f_a(l)s_1(x,l)$,
$g_b(x,l) = s_2(x,l) + f_b(l)s_1(x,l)$,


and put, for arbitrary $v \in \mathfrak{H}$,

$G(l)v(x) = \int_a^b G(x,y,l)v(y)\,dy$.

Then $G(l)$ is a bounded linear operator¹⁸ and

(2.2) $G(l) = (H - l)^{-1}$.


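Now we put, for arbitrary $u \in \mathfrak{H}$,
$u(x,\lambda) = [E(\lambda) - E(0)]u(x)$,
$u(x,\Delta) = u(x,\lambda) - u(x,\mu) = E(\Delta)u(x)$, $(\Delta = (\mu,\lambda])$;


considering $u(x,\Delta)$ as an element of $\mathfrak{H}$, we use the abbreviation $u(\Delta)$ for
$u(x,\Delta)$. For every $u \in \mathfrak{H}$, $u(\Delta)$ belongs to the domain of $H$. Putting


$v(\Delta) = (H - l)u(\Delta) = \int_\Delta (\lambda - l)\,dE(\lambda)u$,


we have therefore, by (2.2), $u(\Delta) = G(l)v(\Delta)$. Hence we have


$\int_a^b G(x,y,l)\,v(y,\Delta)\,dy = \Bigl(u, \int_\Delta (\lambda - \bar{l})\,dE(\lambda)\,\overline{G(x,\cdot,l)}\Bigr)$.

Thus we obtain the formulae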


¹⁷ We follow the method of H. Weyl. Cf. Weyl [14], pp. 238-251.
¹⁸ Weyl [14], pp. 224-231. Cf. also Stone [11], Theorems 10.19, 10.20 and 10.21.




(2.3) $u(x,\Delta) = \Bigl(u, \int_\Delta (\lambda - l)\,dE(\lambda)\,G(x,\cdot,l)\Bigr)$,


(2.4) $du(x,\Delta)/dx = \Bigl(u, \int_\Delta (\lambda - l)\,dE(\lambda)\,G_x(x,\cdot,l)\Bigr)$, $(G_x = \partial G/\partial x)$.


From these formulae follows that the functions $u(x,\lambda)$, $(d/dx)u(x,\lambda)$ are
continuous with respect to $x$ and of bounded variation with respect to $\lambda$ in
every finite interval $[\mu,\lambda]$. Considering $u(\Delta)$ as an element of $\mathfrak{H}$, we have
the relation


$L[u(\Delta)] = Hu(\Delta) = \int_\Delta \lambda\,dE(\lambda)u$,


showing that the function $u(x,\lambda)$ satisfies the integro-differential equation:


(2.5) $L[u(x,\lambda)] = \int_0^\lambda \lambda\,du(x,\lambda)$;

$u(x,\lambda)$ satisfies furthermore the boundary conditions:
$u(x,0) = (d/dx)u(x,0) = 0$.


Under these circumstances, the solution $u(x,\lambda)$ of the equation (2.5) is
given by


(2.6) $u(x,\lambda) = \int_0^\lambda \{s_1(x,\lambda)\,du_1(\lambda) + s_2(x,\lambda)\,du_2(\lambda)\}$,


where

(2.7) $u_1(\lambda) = p(c)u'(c,\lambda)$, $u_2(\lambda) = u(c,\lambda)$, $(u' = du/dx)$.¹⁹


In what follows we assume that the additive functions $u_k(\Delta)$, $\xi_k(\Delta)$,
$\rho_{jk}(\Delta)$, etc. of the interval $\Delta$ are always combined with the corresponding
functions $u_k(\lambda)$, $\xi_k(\lambda)$, $\rho_{jk}(\lambda)$, etc. by the relations as follows:


$u_k(\Delta) = u_k(\lambda) - u_k(\mu)$, $(\Delta = (\mu,\lambda])$.

Now, putting $\gamma_1(l) = p(c)G_x(c,\cdot,l)$, $\gamma_2(l) = G(c,\cdot,l)$ and

(2.8) $\xi_k(\Delta) = \int_\Delta (\lambda - l)\,dE(\lambda)\gamma_k(l)$, $(k = 1,2)$,


we get from (2.3), (2.4) and (2.7)

(2.9) $u_k(\Delta) = (u, \xi_k(\Delta))$, $(k = 1,2)$.


For fixed $\Delta$, $u_k(\Delta)$ can be considered as linear functionals of $u$, not depending
on $l$. Hence we infer, by (2.9), that the functions $\xi_k(\Delta)$ do not depend on $l$.


¹⁹ Weyl [14], Hilfssatz 1 and its proof, pp. 240-241.
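Insert now $u = \xi_j(\Delta)$, $(\Delta = (0,\lambda])$ in (2.6). Then we have


(2.10) $\xi_j(x,\Delta) = \int_\Delta \{s_1(x,\lambda)\,d\rho_{j1}(\lambda) + s_2(x,\lambda)\,d\rho_{j2}(\lambda)\}$,

where


(2.11) $\rho_{jk}(\Delta) = (\xi_j(\Delta), \xi_k(\Delta))$.

Inserting (2.8) in (2.11), we get


$\rho_{jk}(\Delta) = \int_\Delta |\lambda - l|^2\,(dE(\lambda)\gamma_j(l), \gamma_k(l))$,

whence we conclude


(2.12) $\int_{-\infty}^{+\infty} |\lambda - l|^{-2}\,d\rho_{jk}(\lambda) = (\gamma_j(l), \gamma_k(l))$,


where the integral converges absolutely. From (2.11) follows that the
matrix $P(\Delta) = (\rho_{jk}(\Delta))$ is positive semi-definite and independent of $l$.
Using the explicit expression of $G(x,y,l)$, we can readily calculate the right
hand side of (2.12). As a result, we obtain


$(\gamma_j(l), \gamma_k(l)) = \Im M_{jk}(l)/\Im l$,


and therefore the formula


(2.13) $\int_{-\infty}^{+\infty} |\lambda - l|^{-2}\,d\rho_{jk}(\lambda) = \Im M_{jk}(l)/\Im l$,


which yields immediately the spectral formula (1.15). To prove (1.16),
we choose real numbers $\lambda_0, \lambda_1, \dots, \lambda_n$ so that
$\mu = \lambda_0 < \lambda_1 < \cdots < \lambda_n = \lambda$,
put $\delta = \max |\lambda_m - \lambda_{m-1}|$, and consider the sum

$S_\delta(y) = \sum_m \sum_j s_j(x,\lambda_m)\,\xi_j(y,\Delta_m)$, $(\Delta_m = (\lambda_{m-1}, \lambda_m])$.


Then we have, by virtue of (2.10),


$\lim_{\delta \to 0} S_\delta(y) = \int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,d\rho_{jk}(\lambda)$,


while, since $\sum_m \|\xi_j(\Delta_m)\|^2 = \|\xi_j(\Delta)\|^2$, there exists $\lim S_\delta$ in the sense of
$\|\ \|$ and thus $\lim S_\delta \in \mathfrak{H}$. This proves the inequality


$\int_a^b dy\,\Bigl|\int_\Delta \sum_{j,k} s_j(x,\lambda)s_k(y,\lambda)\,d\rho_{jk}(\lambda)\Bigr|^2 < +\infty$.


Again we have, by (2.6) and (2.9),

$E(\Delta)u(x) = \lim_{\delta \to 0} \sum_m \sum_j s_j(x,\lambda_m)\,u_j(\Delta_m)$,

proving (1.16). (1.17) and (1.18) follow immediately from (1.16).
Again, (1.21) follows from (2.13). Thus the spectral theorem has been
proved for the special case mentioned above.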

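To prove the spectral theorem in general cases, we denote the special
fundamental solutions $s_1$, $s_2$ and the corresponding $M_{jk}$, $\rho_{jk}$ mentioned above
by $s_1^0$, $s_2^0$, $M_{jk}^0$, $\rho_{jk}^0$, respectively. Then general $s_1$, $s_2$ are related to $s_1^0$, $s_2^0$
by a unimodular transformation:


(2.14) $s_m^0(x,l) = \sum_j B_{mj}(l)\,s_j(x,l)$, $\det(B_{mj}(l)) = 1$,


where $B_{jk}(l)$ are holomorphic functions of $l$ and $B_{jk}(\bar{l}) = \overline{B_{jk}(l)}$. By the
transformation (2.14), the characteristic matrix is transformed according
to the rule:


$M_{jk}(l) = \sum_{m,n} B_{mj}(l)B_{nk}(l)\,M^0_{mn}(l)$.

This shows, combined with the relation:


$M^0_{mn}(l) = \int_{-\infty}^{+\infty} \{(\lambda - l)^{-1} - (\lambda - l_0)^{-1}\}\,d\rho^0_{mn}(\lambda) + \text{const.}$,

that


(2.15) $M_{jk}(l) - \sum_{m,n} \int_{-\nu}^{\nu} (\lambda - l)^{-1} B_{mj}(\lambda)B_{nk}(\lambda)\,d\rho^0_{mn}(\lambda)$


are regular except for real $l$ such that $l \le -\nu$ or $l \ge \nu$; hence the functions
$\rho_{jk}(\lambda)$ defined by (1.15) are given by


(2.16) $\rho_{jk}(\lambda) = \sum_{m,n} \int_0^\lambda B_{mj}(\lambda)B_{nk}(\lambda)\,d\rho^0_{mn}(\lambda)$.


Inserting this in (2.15), we see immediately that $R^{\nu}_{jk}(l)$ are regular except
for real $l$ such that $l \le -\nu$ or $l \ge \nu$. Again, from (2.14) and (2.16)
follows that the relations (1.16), (1.17) and (1.18) are preserved by the
transformation $s_k^0 \to s_k$, $\rho^0_{jk} \to \rho_{jk}$. Thus the spectral theorem is completely
proved.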


3. Hill’s equation. As an example, we consider


$L = -d^2/dx^2 + q(x)$, $(-\infty < x < +\infty)$,




where $q(x)$ is a continuous periodic function with the period 1.²⁰ Choose
the fundamental solutions $s_1$, $s_2$ of Hill’s equation $L[u] = l \cdot u$ so that
$s_1(0) = s_2'(0) = 0$, $s_2(0) = s_1'(0) = 1$. Then we have


$s_j(x+1, l) = \sum_k h_{jk}(l)\,s_k(x,l)$, $\det(h_{jk}(l)) = 1$,


where $h_{jk}(l)$ are holomorphic functions of $l$ and $h_{jk}(\bar{l}) = \overline{h_{jk}(l)}$. Let $z_\pm(l)$
be the roots of the secular equation


$|z\,\delta_{jk} - h_{jk}(l)| = z^2 - 2r(l)z + 1 = 0$, $(2r(l) = h_{11}(l) + h_{22}(l))$


such that $|z_+(l)| \ge 1$, $|z_-(l)| \le 1$, and $f_\pm(l)$ be the corresponding solutions
of the linear equations:


$h_{11}(l)f + h_{21}(l) = z \cdot f$, $h_{12}(l)f + h_{22}(l) = z$.

Then, putting

(3.1) $g_\pm(x,l) = s_2(x,l) + f_\pm(l)s_1(x,l)$

we obtain the solutions $g_\pm$ of $L[u] = l \cdot u$ having the following properties:

(3.2) $g_\pm(x+1, l) = z_\pm(l)\,g_\pm(x,l)$.


Since, in general, $|z_+(l)| > 1$, $|z_-(l)| < 1$, we infer, by (3.2) and (3.1),
that our $L$ belongs to the case I and $f_a(l) = f_+(l)$, $f_b(l) = f_-(l)$. Hence we get


(1) hoo(l) 


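$\Im M_{jk}(\lambda)$ is therefore $\neq 0$ if and only if $r^2(\lambda) - 1 < 0$, since $h_{jk}(\lambda)$
and $r(\lambda)$ are real for real $\lambda$. Now, it is known²¹ that the equation
$r^2(\lambda) - 1 = 0$ has infinitely many roots $\lambda_0, \lambda_1, \lambda_2, \dots$ such that


$\lambda_0 < \lambda_1 \le \lambda_2 < \lambda_3 \le \lambda_4 < \cdots$, $\lambda_m \to +\infty$ $(m \to \infty)$


and that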
and that 


²⁰ Many examples of applications of the spectral theorem are to be found in Titchmarsh
[13]. Here we shall consider Hill’s equation, which has been treated by Wintner
[20], [18]. Wintner has determined the spectrum of this equation; above, the matrix
$P(\lambda)$ determining the spectral resolution and, therefore, the spectrum, will be obtained.
²¹ Cf. Strutt [12].




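$r^2(\lambda) - 1 > 0$, for $\lambda < \lambda_0$ and $\lambda_{2m-1} < \lambda < \lambda_{2m}$,

$r^2(\lambda) - 1 < 0$, for $\lambda_{2m} < \lambda < \lambda_{2m+1}$.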


Using the spectral formula (1.15), we get therefore, from (3. 3) 


dP(X) (for Arm => = 
0, (otherwise). 
Thus we obtain, by (1. 23), 


Nom 


m=0 om 


A) sey, 2) 


— (x, A) — ] (s2(@, A) 
+ A) s2(y, A) ) }dy. 


The operator $H = -d^2/dx^2 + q(x)$ has therefore no point spectrum and the
continuous spectrum of $H$ consists of infinitely many intervals $[\lambda_{2m}, \lambda_{2m+1}]$
$(m = 0, 1, 2, \dots)$.
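The function $r(\lambda)$, whose values decide whether $\lambda$ lies in one of the spectral intervals, can be computed by integrating the equation over one period. The following is a minimal numerical sketch; it assumes the convention $s_j(x+1) = \sum_k h_{jk} s_k(x)$, under which $h_{11} = s_1'(1)$ and $h_{22} = s_2(1)$, and the function name `discriminant` and the sample potential are illustrative choices made here, not from the paper.

```python
import math

def discriminant(lam, q, steps=2000):
    """r(lam) = (h11 + h22) / 2 = (s1'(1) + s2(1)) / 2 for -u'' + q(x) u = lam u,
    where s1(0) = s2'(0) = 0, s2(0) = s1'(0) = 1. Each solution is integrated
    over one period [0, 1] with the classical fourth-order Runge-Kutta scheme."""
    h = 1.0 / steps

    def f(x, u, du):
        # -u'' + q(x) u = lam u  rewritten as  u'' = (q(x) - lam) u
        return du, (q(x) - lam) * u

    def propagate(u, du):
        x = 0.0
        for _ in range(steps):
            k1u, k1d = f(x, u, du)
            k2u, k2d = f(x + h / 2, u + h / 2 * k1u, du + h / 2 * k1d)
            k3u, k3d = f(x + h / 2, u + h / 2 * k2u, du + h / 2 * k2d)
            k4u, k4d = f(x + h, u + h * k3u, du + h * k3d)
            u, du = (u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u),
                     du + h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d))
            x += h
        return u, du

    _, ds1_at_1 = propagate(0.0, 1.0)   # s1(0) = 0, s1'(0) = 1
    s2_at_1, _ = propagate(1.0, 0.0)    # s2(0) = 1, s2'(0) = 0
    return 0.5 * (ds1_at_1 + s2_at_1)

def sample_q(x):
    """An illustrative potential of period 1."""
    return 2.0 * math.cos(2.0 * math.pi * x)

for lam in (-10.0, 0.0, 5.0, 10.0):
    r = discriminant(lam, sample_q)
    print(lam, r, "in a spectral interval" if r * r < 1 else "in a gap")
```

For $q = 0$ the routine reproduces $r(\lambda) = \cos\sqrt{\lambda}$, so that $r^2(\lambda) \le 1$ for every $\lambda \ge 0$ and the gaps close up.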


4. The case that $f_a(l)$ is a meromorphic function. As one readily
verifies, $f_a(l)$ is a meromorphic function or not, independently of the choice
of the system of fundamental solutions $s_1$, $s_2$, and, in case $L$ is of the l.c.
type at $a$, of the boundary condition. In case $f_a(l)$ is a meromorphic function,
the system of fundamental solutions $s_1$, $s_2$ can be chosen so that $f_a(l) = \infty$
identically in $l$. Such a system of fundamental solutions will be called normal.
In case $L$ is of the l.c. type at $a$, we have $f_a(l) = \infty$ if and only if $s_1(l)$
satisfies $[w_as_1(l)](a) = 0$; in case $L$ is of the l.p. type at $a$, $f_a(l) = \infty$
if and only if


$\int_a^c |s_1(x,l)|^2\,dx < +\infty$, $(a < c < b)$.


Now, in case $f_a(l) = \infty$ identically in $l$, the characteristic matrix $M(l)$
has the form


$M(l) = \begin{pmatrix} f_b(l) & \tfrac12 \\ \tfrac12 & 0 \end{pmatrix}$.


Thus, in this case, the spectral theorem becomes as follows:




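THEOREM 4.1. (SPECIAL FORM OF THE SPECTRAL THEOREM.) Assume
that $f_a(l) = \infty$ identically in $l$. Then $f_b(l)$ is regular analytic in $\Im l \neq 0$.
For every real number $\lambda$ there exists the limit:


(4.1) $\rho(\lambda) = \lim_{\delta \to +0} \lim_{\epsilon \to +0} \dfrac{1}{\pi} \int_{\delta}^{\lambda+\delta} \Im f_b(\lambda + i\epsilon)\,d\lambda$.


$\rho(\lambda)$ is a monotone non-decreasing function of $\lambda$ which is continuous on the
right. Let $H = \int \lambda\,dE(\lambda)$ be the spectral representation of $H$, and put
$E(\Delta) = E(\lambda) - E(\mu)$ for every finite interval $\Delta = (\mu,\lambda]$. Then, for
arbitrary $u \in \mathfrak{H}$ we have

(4.2) $E(\Delta)u(x) = \int_a^b u(y)\,dy \int_\Delta s_1(x,\lambda)s_1(y,\lambda)\,d\rho(\lambda)$,


where


$\int_a^b dy\,\Bigl|\int_\Delta s_1(x,\lambda)s_1(y,\lambda)\,d\rho(\lambda)\Bigr|^2 < +\infty$


and the integral $\int_a^b dy$ in (4.2) converges absolutely. $u(x)$ itself is represented
as follows:


(4.3) $u(x) = \lim_{\Delta \to (-\infty,+\infty)} \int_a^b u(y)\,dy \int_\Delta s_1(x,\lambda)s_1(y,\lambda)\,d\rho(\lambda)$,


where the limit converges in the mean. If, especially, $u$ belongs to the
domain of $H$, we have


(4.4) $Hu(x) = \lim_{\Delta \to (-\infty,+\infty)} \int_a^b u(y)\,dy \int_\Delta s_1(x,\lambda)s_1(y,\lambda)\,\lambda\,d\rho(\lambda)$.


$\rho(\lambda)$ is represented also in the following form:


(4.5) $\rho(\lambda) = -\lim_{\mu \to \lambda+0} \lim_{\epsilon \to +0} (2\pi i)^{-1} \int_{C(\mu,\alpha,\epsilon)} f_b(l)\,dl$,

where $C(\mu, \alpha, \epsilon)$ means the contour consisting of two oriented polygonal lines
whose vertices, in order, are $\mu + i\epsilon$, $\mu + i\alpha$, $i\alpha$, $i\epsilon$, and $-i\epsilon$, $-i\alpha$, $\mu - i\alpha$,
$\mu - i\epsilon$, respectively, the real number $\alpha$ being subject to the inequality $\alpha > \epsilon$.
For every $\nu > 0$, the residual term

$R^{\nu}(l) = f_b(l) - \int_{-\nu}^{\nu} (\lambda - l)^{-1}\,d\rho(\lambda)$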




is regular in the $l$-plane except for real $l$ such that $l \le -\nu$ or $l \ge \nu$. Thus
the singularities of $f_b(l)$ in the finite $l$-plane are completely determined by
$\rho(\lambda)$. Especially, every isolated singular point of $f_b(l)$ is a pole of the order 1.


Again, the formulae (1.24), (1.25), ete, and Theorem 1.4 are 
accordingly simplified as follows: Consider A-measurable functions ¢()), 
define the norm of $(A) by 


Il ¢ 


f 
and introduce the Hilbert space $* = {¢ | || ¢ || < + co}. Then we have 


THEOREM 4.2. The transformation

(4.6)  $u(x) \to \phi(\lambda) = \int_a^b s_1(x,\lambda)\,u(x)\,dx$

is a unitary transformation mapping $\mathfrak{H}$ on $\mathfrak{H}^*$, whose inverse is given by
where the integral in (4.6) or (4.7) converges in the sense of $\|\ \|$ in $\mathfrak{H}$
or $\mathfrak{H}^*$, respectively.


By virtue of (4.6) and (4.7), the formula (4.3) can be rewritten as
follows:

(4.8)  $u(x) = \int_{-\infty}^{\infty} s_1(x,\lambda)\,\phi(\lambda)\,d\rho(\lambda).$

Again, (4.4) shows that the transformation $u \to \phi$ transforms $H$ into the
"diagonal form" $\phi(\lambda) \to \lambda\,\phi(\lambda)$.

In the case that $f_a(l)$ and $f_b(l)$ are both meromorphic functions, the
structure of $H$ is a very simple one. To see this, we choose $s_1, s_2$ so that
$f_a(l) = \infty$ identically. The corresponding $f_b(l)$ is then a meromorphic
function which is regular in $\Im l \ne 0$. Denote the poles of $f_b(l)$ by $\lambda_m$
$(m = 1, 2, 3, \dots)$ and put


Then we have, by (4.5), 


(4. 9) p(A) = pm 




while, since the poles $\lambda_m$ are all of the order 1, $\rho_m > 0$. The formula (4.8)
becomes therefore as follows:

(4.10)  $u(x) = \sum_m s_1(x,\lambda_m)\,\rho_m \int_a^b s_1(y,\lambda_m)\,u(y)\,dy.$

Thus, in this case, $H$ has no continuous spectrum and the point spectrum
of $H$ is a discrete set consisting of $\lambda_1, \lambda_2, \lambda_3, \dots$. Conversely, if $H$ has no
continuous spectrum and the point spectrum of $H$ is a discrete set, $f_a(l)$ and
$f_b(l)$ are both meromorphic functions, since, by (1.19), $M_{jk}(l)$ are meromorphic
functions. We conclude:


THEOREM 4.3.$^{22}$ $H$ has only the discrete point spectrum if and only
if $f_a(l)$ and $f_b(l)$ are both meromorphic functions.


What is the condition for $f_a(l)$ to be a meromorphic function? In case
$L$ is of the l.c. type at $a$, $f_a(l)$ is always a meromorphic function, as was
already mentioned in Theorem 1.2; while, in case $L$ is of the l.p. type at $a$,
the situation is complicated. Assume, for simplicity's sake, that $L$ is analytic
in a neighborhood of $a$, i.e. $p(x)$ and $q(x)$ are analytic functions in a
neighborhood of $a$. Then we have, for example,


THEOREM 4.4. Assume that $a > -\infty$ and $a$ is the regular singular
point of the differential equation $L[u] = 0$. Then, $f_a(l)$ is a meromorphic
function, if

  $(x - a)^2/p(x) \to 0$  $(x \to a)$.


Combined with Theorem 4.3, this yields immediately the following


THEOREM 4.5. Assume that $L$ is analytic, $-\infty < a$, $b < +\infty$ and
$a, b$ are both regular singular points of the differential equation $L[u] = 0$.
Then $H$ has only the discrete point spectrum, if

  $(x - a)^2/p(x) \to 0$  $(x \to a)$,   $(x - b)^2/p(x) \to 0$  $(x \to b)$.


5. Schrödinger equation. Let us consider the differential operator of
Schrödinger type:

  $L = -d^2/dx^2 + \nu(\nu + 1)/x^2 + V(x)$,  $(0 < x < \infty)$,

where $\nu$ is a real number $\ge -\frac12$ and $V(x)$ is a real continuous function
such that


22 Cf. Titchmarsh [13], Chap. II.


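  $V(x) = O(x^{-2+\varepsilon})$, for $x \to 0$;   $V(x) = [\alpha + O(x^{-\varepsilon})]/x$, for $x \to \infty$,  $(\varepsilon > 0)$.

Then the system of fundamental solutions $s_1, s_2$ of $L[u] = lu$ (satisfying
(1.2)) can be chosen so that, as $x \to 0$,

  $s_1(x,l) \sim x^{\nu+1}$;   $s_2(x,l) \sim x^{-\nu}/(2\nu + 1)$,  $\nu > -\frac12$;
  $s_2(x,l) \sim -x^{1/2}\log x$,  $\nu = -\frac12$.

We conclude therefore that, at 0, $L$ is of the l.p. type, if $\nu \ge \frac12$, and of the
l.c. type, if $\nu < \frac12$. In case $\nu \ge \frac12$, the system of fundamental solutions $s_1$,
$s_2$ defined above is normal (i.e. $f_a(l) = \infty$), since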


+ o. 


In case $\nu < \frac12$, we take, as the boundary condition at 0,

(5.1)  $[w_0 u](0) = 0$,  $w_0(x) = s_1(x, 0)$;

then the system $s_1, s_2$ is also normal. Thus Theorem 4.1 and Theorem 4.2
can be applied here.

In order to investigate the asymptotic behavior of the solution of
$L[u] = l \cdot u$ at $\infty$, we put

  $l = k^2$.


Then we have 


THEOREM 5.1. If $\Im k \ge 0$, $k \ne 0$, the equation $L[u] = k^2 u$ has one
and only one solution $u(x,k)$ such that

(5.2)  $u(x,k) \sim \exp[ikx - (\tfrac12 i\alpha/k)\log x]$, for $x \to \infty$.

As functions of two variables $x, k$, $u(x,k)$ and $u'(x,k)$ are continuous in
$\Im k \ge 0$, $k \ne 0$; as functions of $k$, $u(x,k)$ and $u'(x,k)$ are
regular analytic in $\Im k > 0$. Furthermore we have

(5.3)  $u'(x,k) \sim (d/dx)\exp[ikx - (\tfrac12 i\alpha/k)\log x]$, for $x \to \infty$.


Proof. Put $u = v \cdot \exp[ikx - (\tfrac12 i\alpha/k)\log x]$. Then the equation $L[u]
= k^2 u$ is reduced to

(A)  $[p(x,k)\,v']' - p(x,k)\,V^*(x,k)\,v = 0,$




where
  $p(x,k) = \exp[2ikx - (i\alpha/k)\log x]$,
  $V^*(x,k) = [\alpha/2ik + \alpha^2/4k^2 + \nu(\nu+1)]/x^2 + V(x) - \alpha/x$;


thus we have first to show that the differential equation (A) has one and
only one solution $v = v(x,k)$ satisfying $v(x,k) \to 1$ $(x \to \infty)$. This can be
readily done by transforming the differential equation (A) into the integral
equation


(B) = g(a, k) — (a, y, k)v(y) dy, 
where g(a, hk) = (1— «/2k*x) and 


K (a, y, &) = (p(a, k)/p’ (x, k)) {1 — p(y, k) /p (x, k) V*(y, k) 
+ (d/dy) (p’(y, &)/p(y, &))}, (Sy). 


Furthermore, in the well-known way, solving the integral equation (B) by
the method of iteration, we infer readily that the solution $v(x,k)$ of (B)
with $v(x,k) \to 1$ $(x \to \infty)$ and its derivative $v'(x,k)$ are continuous in
$\Im k \ge 0$, $k \ne 0$ and that, as functions of $k$, $v(x,k)$ and $v'(x,k)$
are holomorphic in $\Im k > 0$. Thus Theorem 5.1 is proved.$^{23}$
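
The logarithmic term in the exponent of (5.2) can be made plausible by a heuristic phase integral. The following is only a minimal sketch under assumptions that are not part of the proof above: it takes $\nu = 0$ and the pure tail $V(x) = \alpha/x$, and it compares the increment of the phase integral of $(k^2 - \alpha/x)^{1/2}$ with the increment of $kx - (\alpha/2k)\log x$. The function names and the Simpson quadrature are illustrative choices.

```python
import math


def exact_phase_increment(k, alpha, x1, x2, n=4000):
    """Simpson approximation of the integral of sqrt(k^2 - alpha/t) over [x1, x2].

    Assumes k > 0 and k^2 > alpha/x1, so the integrand is real and smooth.
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (x2 - x1) / n

    def integrand(t):
        return math.sqrt(k * k - alpha / t)

    total = integrand(x1) + integrand(x2)
    for j in range(1, n):
        weight = 4 if j % 2 else 2
        total += weight * integrand(x1 + j * h)
    return total * h / 3.0


def asymptotic_phase(k, alpha, x):
    """Phase k*x - (alpha/2k) log x appearing in (5.2) for real k > 0."""
    return k * x - (alpha / (2.0 * k)) * math.log(x)


def phase_mismatch(k, alpha, x1, x2):
    """Difference between the phase-integral increment and the asymptotic one."""
    approx = asymptotic_phase(k, alpha, x2) - asymptotic_phase(k, alpha, x1)
    return exact_phase_increment(k, alpha, x1, x2) - approx
```

Under these assumptions the mismatch is of the order $\alpha^2/(8k^3 x)$ and shrinks as the interval moves toward $\infty$, which is the behavior one would expect from (5.2).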

From (5.2) follows

(5.4)  $\int_1^{\infty} |u(x,k)|^2\,dx < \infty$ or $= \infty$ according as $\Im k > 0$ or $\Im k = 0$;

whence we conclude that $L$ is of the l.p. type at $\infty$. We need therefore no
boundary condition at $\infty$.


Put now
  $u(x,k) = A(k)s_2(x,l) - B(k)s_1(x,l)$,  $(k^2 = l,\ \Im k \ge 0)$.
Then we have, by (1.10) and (5.4),
(5.5)  $f_b(l) = -B(k)/A(k)$,  $(k^2 = l,\ \Im k \ge 0)$.
$A(k)$ and $B(k)$ are given by
(5.6)  $A(k) = [u(k)s_1(k)]$,  $B(k) = [u(k)s_2(k)]$.
Hence, by Theorem 5.1, $A(k)$, $B(k)$ are regular in $\Im k > 0$ and continuous


23 After the variation of constants employed in the proof above, Theorem 5.1 is
contained in a result of Bôcher [2]; cf. also Wintner [17], [19], [21].




in $\Im k \ge 0$, $k \ne 0$. Again, since $u(x,k)$ is not identically 0, $A(k)$ and $B(k)$
have no common zero point. We conclude therefore, by virtue of (5.5),
that $f_b(l)$ is meromorphic except for $l \ge 0$ and every pole of $f_b(l)$ which
is not $\ge 0$ corresponds one-to-one to the zero point of $A(k)$, since, by the
relation $l = k^2$, $k$ with $\Im k > 0$ corresponds one-to-one to $l$ which is not $\ge 0$.
On the other hand, by Theorem 4.1, every pole of $f_b(l)$ lies on the real axis
and is of the order 1. Hence, in $\Im k > 0$, all zero points of $A(k)$ lie on the
imaginary axis and are of the order 1. Denote these zero points by
$k_m = i|k_m|$, $(m = 1, 2, 3, \dots)$; then the poles of $f_b(l)$ which are not
$\ge 0$ are given by $\lambda_m = k_m^2 = -|k_m|^2$, $(m = 1, 2, 3, \dots)$. Now, for real
$k \ne 0$, we conclude from (5.2) and (5.3) the formulae


(5.7)  $u(x,-k) = \overline{u(x,k)}$, $(k > 0)$;   $[u(-k)u(k)] = 2ik$, $(k > 0)$;

which yields

(5.8)  $A(-k) = \overline{A(k)}$,  $B(-k) = \overline{B(k)}$,  $(k > 0)$,
(5.9)  $A(k)B(-k) - A(-k)B(k) = 2ik$,  $(k > 0)$.

Hence $A(k)$ and $B(k)$ do not vanish for real $k \ne 0$.
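
As an illustration of the decomposition $u = A s_2 - B s_1$ and of the relation (5.9), the following sketch works out the simplest case. It is an assumption made here for illustration only, not an example taken from the paper: $\nu = 0$ and $V(x) = 0$ (hence $\alpha = 0$), for which one may take $s_1 = \sin(kx)/k$, $s_2 = \cos(kx)$ and $u(x,k) = e^{ikx}$, so that $A(k) = 1$ and $B(k) = -ik$.

```python
import cmath
import math


def s1(x, k):
    """Regular solution of the free equation; behaves like x as x -> 0."""
    return cmath.sin(k * x) / k


def s2(x, k):
    """Second solution of the free equation; behaves like 1 as x -> 0."""
    return cmath.cos(k * x)


def u(x, k):
    """Solution with the asymptotic form exp(ikx) (here exact, since V = 0)."""
    return cmath.exp(1j * k * x)


def A(k):
    """Coefficient of s2 in u = A*s2 - B*s1 for the free case."""
    return 1.0 + 0.0j


def B(k):
    """Coefficient of s1 (with the sign convention u = A*s2 - B*s1)."""
    return -1j * k


def wronskian_relation(k):
    """Left-hand side of (5.9); should equal 2ik."""
    return A(k) * B(-k) - A(-k) * B(k)


def spectral_density(k):
    """d rho / dk = (2/pi) k^2 / |A(k)|^2 for k > 0."""
    return (2.0 / math.pi) * k * k / abs(A(k)) ** 2
```

In this free case the density reduces to $(2/\pi)k^2$ and the normalized eigenfunction $(2/\pi)^{1/2}|k/A(k)|\,s_1(x,k^2)$ becomes $(2/\pi)^{1/2}\sin kx$, i.e. the kernel of the ordinary Fourier sine transform.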


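Now, using these results, we can readily calculate the function $\rho(\lambda)$.
First, for $\lambda > 0$, we have

  $d\rho(\lambda) = -(1/\pi)\,\Im\{B(k)/A(k)\}\,d\lambda$,  $(k = \lambda^{1/2})$,

while, by (5.8) and (5.9),

  $\Im\{B(k)/A(k)\} = -k/|A(k)|^2$,  $(k > 0)$.

Hence we obtain

  $d\rho(\lambda) = (2/\pi)\,(k^2/|A(k)|^2)\,dk$,  $(k = \lambda^{1/2},\ \lambda > 0)$.

Secondly, for $\lambda < 0$, $\rho(-0) - \rho(\lambda)$ is, by virtue of (4.5), equal to the
sum of the residues $\rho_m$ of $-f_b(l)$ at the poles $\lambda_m$, $\lambda < \lambda_m < 0$, i.e.,

(5.10)  $\rho(-0) - \rho(\lambda) = \sum_{\lambda < \lambda_m < 0} \rho_m$,  where  $\rho_m = 2i\,|k_m|\,B(k_m)/A'(k_m)$.

Obviously $\rho_m$ is positive. Finally, putting $\rho_0 = \rho(0) - \rho(-0)$, we have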




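  $\rho_0 = -(1/2\pi i)\,\lim_{\varepsilon\to+0} \oint_{|l| = \varepsilon^2} f_b(l)\,dl,$

or

(5.11)  $\rho_0 = (1/\pi)\,\lim_{\varepsilon\to+0}\ \varepsilon^2 \int_0^{\pi} \{B(\varepsilon e^{i\theta})/A(\varepsilon e^{i\theta})\}\,e^{2i\theta}\,d\theta;$

$\rho_0$ is $\ge 0$. Thus, in this case, the formula (4.8) becomes as follows:

  $u(x) = \sum_{m=0}^{\infty} s_1(x,\lambda_m)\,\rho_m \int_0^{\infty} s_1(y,\lambda_m)\,u(y)\,dy
          + (2/\pi) \int_0^{\infty} s_1(x,k^2)\,|k/A(k)|^2\,dk \int_0^{\infty} s_1(y,k^2)\,u(y)\,dy,$

where $\lambda_0 = 0$. We conclude therefore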


THEOREM 5.2.$^{24}$ The point spectrum of the Schrödinger operator
$H = -d^2/dx^2 + \nu(\nu+1)/x^2 + V(x)$ consists of all points $\lambda_m
= -|k_m|^2$, $(m = 1, 2, 3, \dots)$ and, possibly, of the point $\lambda_0 = 0$. The
continuous spectrum of $H$ is the interval $[0, \infty)$, if 0 does not belong to the
point spectrum, and $(0, \infty)$, in the other case. The corresponding "normalized
eigenfunctions" are given by

  $u_m(x) = \rho_m^{1/2}\,s_1(x,\lambda_m)$,  $m = 0, 1, 2, \dots$,

  $u_k(x) = (2/\pi)^{1/2}\,|k/A(k)|\,s_1(x,k^2)$,  $k > 0$,

where $\rho_m$ are the quantities defined by (5.10), (5.11), and $\rho_m > 0$
$(m = 1, 2, \dots)$. By means of these eigenfunctions, every square
summable function $u(x)$ can be expanded in the following form:

  $u(x) = \sum_{m=0}^{\infty} u_m(x) \int_0^{\infty} u_m(y)\,u(y)\,dy + \int_0^{\infty} u_k(x)\,dk \int_0^{\infty} u_k(y)\,u(y)\,dy.$
For $k > 0$, we conclude, using (5.7),

(5.12)  $s_1(x,k^2) = (1/2ik)\{A(-k)\,u(x,k) - A(k)\,u(x,-k)\}$,  $(k > 0)$.

Putting

(5.13)  $A(-k) = \overline{A(k)} = |A(k)|\,e^{i\delta(k)}$,  $(k > 0)$,


24 Cf. Titchmarsh [12], Chap. V. The location of the continuous spectrum can be
read off from more general theorems; see Wintner [18]; Hartman and Wintner [4], [5],
[6]; Hartman [3].


we have therefore
  $u_k(x) = (2/\pi)^{1/2}\,(1/2i)\{e^{i\delta(k)}\,u(x,k) - e^{-i\delta(k)}\,u(x,-k)\}.$
This yields, combined with (5.2), the asymptotic formula:
(5.14)  $u_k(x) \sim (2/\pi)^{1/2} \sin[kx - (\alpha/2k)\log x + \delta(k)]$,  $(x \to \infty)$.


6. Heisenberg's S-function. In this section we shall investigate the
Schrödinger operator treated in 5 under the following assumption:


ASSUMPTION I. The functions $u(x,k)$ and $u'(x,k)$ introduced in
Theorem 5.1 can be extended analytically over the whole lower half $k$-plane
across the negative part of the real axis.


Under this assumption, the functions $A(k)$ and $B(k)$ are also extended
analytically over the whole lower half plane across the negative part of the
real axis, and the relations (5.8), (5.9), (5.12) proved for $k > 0$ are
extended for $\Im k > 0$, i.e., we have

(6.1)  $A(-\bar{k}) = \overline{A(k)}$,  $B(-\bar{k}) = \overline{B(k)}$,  $(\Im k > 0)$,

(6.2)  $A(k)B(-k) - A(-k)B(k) = 2ik$,  $(\Im k > 0)$,

(6.3)  $s_1(x,k^2) = (2ik)^{-1}\{A(-k)\,u(x,k) - A(k)\,u(x,-k)\}$,  $(\Im k > 0)$.

Now we make furthermore


ASSUMPTION II. $u(x,k)$ and $u'(x,k)$ are regular in the whole $k$-plane
except for $k = 0$.


Then the functions $A(k)$ and $B(k)$ are also regular in the whole $k$-plane
except for $k = 0$. Furthermore, using (6.1), (6.2) and (6.3), we can
eliminate the function $B(k)$ from the formula (5.10). In fact, inserting
$k = k_m$ in (6.2), we get $A(-k_m)B(k_m) = 2|k_m|$, which shows that
$A(-k_m)$ is real, since $B(k_m) = \overline{B(k_m)}$ by (6.1). Putting

(6.4)  $S(k) = A(-k)/A(k)$,  $(\Im k \ge 0,\ k \ne 0)$,

we obtain therefore

  $\rho_m = (2/\pi)\,|k_m/A(-k_m)|^2 \oint_{k_m} S(k)\,dk.$


Again, from (6.3) follows

  $s_1(x,\lambda_m) = -\tfrac12\,\{A(-k_m)/|k_m|\}\,u(x,k_m).$

Hence the normalized eigenfunction $u_m(x)$ is represented as follows:

  $u_m(x) = -C_m\,(2\pi)^{-1/2}\,u(x,k_m)$,  where  $C_m^2 = \oint_{k_m} S(k)\,dk.$

$S(k)$ will be called the Heisenberg's S-function, since it is a diagonal
element of the Heisenberg S-matrix.$^{25}$ By virtue of (5.13), we have
$S(k) = e^{2i\delta(k)}$, $(k > 0)$. Hence $S(k)$ is obtained, under the Assumption I,
from $e^{2i\delta(k)}$ by analytic continuation, if one knows the "phase shift" $\delta(k)$
from the asymptotic behavior of the "normalized eigenfunction" $u_k(x)$. Under
the Assumptions I and II, $S(k)$ is meromorphic in $\Im k > 0$, has poles
$k_m = i|k_m|$ $(m = 1, 2, \dots)$ of the order 1, and except for these poles,
$S(k)$ is regular in $\Im k > 0$, since $A(k)$ and $A(-k)$ are regular in $\Im k > 0$
and have no common zero point, as one sees from (6.3). Thus the following
theorem of Heisenberg has been rigorously established:
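
For real $k > 0$ the identity $S(k) = e^{2i\delta(k)}$ is a one-line consequence of (5.13) and (6.4). The sketch below only illustrates that step numerically; the sample value of $A(k)$ is hypothetical and the helper names are not from the paper. It uses the relation $A(-k) = \overline{A(k)}$ of (5.8), so it applies to real $k$ only.

```python
import cmath


def phase_shift(a_of_k):
    """Return delta with conj(A(k)) = |A(k)| * exp(i*delta), as in (5.13)."""
    return -cmath.phase(a_of_k)


def s_function(a_of_k):
    """S(k) = A(-k)/A(k) for real k > 0, using A(-k) = conj(A(k)) from (5.8)."""
    return a_of_k.conjugate() / a_of_k


# Hypothetical sample value of A(k) at some real k > 0 (illustration only).
sample_A = 0.8 - 0.6j
sample_delta = phase_shift(sample_A)
sample_S = s_function(sample_A)
```

With this sample value, `sample_S` has modulus 1 and agrees with `cmath.exp(2j * sample_delta)`, which is the unitarity of the S-function on the positive real axis.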


THEOREM 6.1 (HEISENBERG).$^{26}$ Let $u_k(x)$, $k > 0$, be the solution of
the Schrödinger equation:

  $d^2u/dx^2 + [k^2 - \nu(\nu+1)/x^2 - V(x)]\,u = 0$,  $(0 < x < \infty)$,

satisfying the "boundary condition"

  $u_k(x) \sim \text{const.}\ x^{\nu+1}$,  $(x \to 0)$,

and having the asymptotic form

  $u_k(x) \sim (2/\pi)^{1/2} \sin[kx - (\alpha/2k)\log x + \delta(k)]$,  $(x \to \infty)$.

Then, under the Assumptions I and II, the function $S(k) = e^{2i\delta(k)}$ can be
extended analytically over the whole upper half $k$-plane. The extended $S(k)$
has, on the imaginary axis, the poles $k_m = i|k_m|$ $(m = 1, 2, \dots)$ of the
order 1, and, except for these poles, $S(k)$ is regular in $\Im k > 0$. The negative
eigenvalues of the Schrödinger equation are given by $\lambda_m = k_m^2 = -|k_m|^2$,
$(m = 1, 2, \dots)$ and the corresponding normalized eigenfunctions $u_m(x)$ have
the asymptotic expressions:


25 Heisenberg [7]; see also Pauli [10], Ma [9], Bargman [1].
26 Heisenberg [7], Part III and IV. Cf. also Ma [9], Pauli [10].




  $u_m(x) \sim C_m\,(2\pi)^{-1/2} \exp[-|k_m|\,x - (\alpha/2|k_m|)\log x]$,  $(x \to \infty)$,

where the constants $C_m$ are given by $C_m^2 = \oint_{k_m} S(k)\,dk$.


It is an open question whether the Assumption I is valid for every
Schrödinger equation or not. If Assumption I were not valid, it would
be impossible to define $S(k)$ for $\Im k > 0$. In case the Assumption II is not
fulfilled (while the Assumption I is valid), $S(k)$ has, in general, singular
points in $\Im k > 0$ other than $k_m$ $(m = 1, 2, \dots)$, arising from the numerator
$A(-k)$ of $S(k)$. Thus, in this case, it might be impossible to determine
the negative eigenvalues of the Schrödinger equation from the singular points
of $S(k)$.$^{27}$ The necessary and sufficient condition for $V(x)$ in order that the
Assumptions I and II are valid is not known. But it can be proved that
the Assumptions I and II are valid if the potential $V(x)$ is analytic in a
neighborhood of $\infty$. This asserts the validity of Heisenberg's theorem
for the Schrödinger equations usually appearing in applications.


27 S. T. Ma has pointed out this fact by an example. Cf. Ma [9].


BIBLIOGRAPHY.


[1] Bargman, V., "Remarks on the determination of a central field of force from the
elastic scattering phase shifts," Physical Review, vol. 75 (1949), pp. 301-303.
[2] Bôcher, M., "On regular singular points of linear differential equations of the
second order whose coefficients are not necessarily analytic," Transactions
of the American Mathematical Society, vol. 1 (1900), pp. 40-52.
[3] Hartman, P., "On the spectra of slightly disturbed linear oscillators," American
Journal of Mathematics, vol. 71 (1949), pp. 71-79.
[4] Hartman, P. and Wintner, A., "An oscillation theorem for continuous spectra,"
Proceedings of the National Academy of Sciences, vol. 33 (1947), pp. 376-379.
[5] Hartman, P. and Wintner, A., "On the location of spectra of wave equations,"
American Journal of Mathematics, vol. 71 (1949), pp. 214-217.
[6] Hartman, P. and Wintner, A., "A criterion for the non-degeneracy of the wave
equation," American Journal of Mathematics, vol. 71 (1949), pp. 206-213.
[7] Heisenberg, W., "Die beobachtbaren Grössen in der Theorie der Elementarteilchen,"
I, Zeitschrift für Physik, vol. 120 (1943), p. 513; II, ibid., p. 673.
III, IV, unpublished.
[8] Jost, R., Helvetica Physica Acta, vol. 22 (1947), p. 256.
[9] Ma, S. T., "On a general condition of Heisenberg for the S matrix," Physical
Review, vol. 71 (1947), pp. 195-200.
[10] Pauli, W., Meson theory of nuclear forces, New York (1946).
[11] Stone, M. H., Linear transformations in Hilbert space, American Mathematical
Society Colloquium Publications, vol. 15 (1932).
[12] Strutt, M. J. O., Lamésche, Mathieusche und verwandte Funktionen in Physik und
Technik, Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 1, no. 3
(1932).
[13] Titchmarsh, E. C., Eigenfunction expansions associated with second-order
differential equations, Oxford (1946).
[14] Weyl, H., "Über gewöhnliche Differentialgleichungen mit Singularitäten und die
zugehörigen Entwicklungen willkürlicher Funktionen," Mathematische
Annalen, vol. 68 (1910), pp. 220-269.
[15] Weyl, H., "Über gewöhnliche Differentialgleichungen mit singulären Stellen und
ihre Eigenfunktionen," Göttinger Nachrichten (1910), pp. 442-467.
[16] Weyl, H., "Über das Pick-Nevanlinnasche Interpolationsproblem und sein
infinitesimales Analogon," Annals of Mathematics, vol. 36 (1935), pp. 230-254.
[17] Wintner, A., "Asymptotic integrations of the adiabatic oscillator," American
Journal of Mathematics, vol. 69 (1947), pp. 251-272.
[18] Wintner, A., "On the location of continuous spectra," American Journal of
Mathematics, vol. 70 (1948), pp. 22-30.
[19] Wintner, A., "On the normalization of characteristic differentials in continuous
spectra," Physical Review, vol. 72 (1947), pp. 516-517.
[20] Wintner, A., "Stability and spectrum in the wave mechanics of lattices," Physical
Review, vol. 72 (1947), pp. 81-82.
[21] Wintner, A., "Asymptotic integration of the adiabatic oscillator in its hyperbolic
range," Duke Mathematical Journal, vol. 15 (1948), (IV), pp. 55-67.



EXPANSION METHODS FOR THE ISOPERIMETRIC PROBLEM OF
BOLZA IN NON-PARAMETRIC FORM.*


By WILLIAM T. REID.


1. Introduction. In a previous paper [15]! the author derived an 
effective Lindeberg theorem for isoperimetric problems of Bolza type in non- 
parametric form, and with the aid of this theorem established sufficient 
conditions for a strong relative minimum. The author has felt, however, that 
the sufficiency theorem of [15] should be derivable by an extension of the 
expansion method of proof that he had used earlier for ordinary problems 
of Bolza in non-parametric form ([12], [13]), and without the aid of an 
auxiliary “Lindeberg theorem.” In Part I of the present paper such a 
proof is presented. 

Section 2 is concerned with the formulation of the problem, while 
certain preliminary results for the expansion method of proof are given in 
Section 3. In Section 4 the increment of the functional to be minimized is 
expanded in terms of arbitrary “slope-functions ” and “ multipliers” satis- 
fying certain continuity properties. Section 5 is devoted to the proof of the 
sufficient conditions for a strong relative minimum as stated in Theorem 2. 1. 
Section 6 contains a brief discussion of an “ Osgood theorem ” for the problem 
under consideration. 

Theorem 2. 1 is equivalent to the sufficiency theorems proved by Hestenes 
[5] and Reid [15]. It is to be pointed out, however, that for a problem 
equivalent to the one herein treated, Hestenes ([7], [8]; see, in particular, 
the statement on page 510 of [8]) has shown that conditions weaker than 
those of Theorem 2.1 suffice to insure a strong relative minimum. In 
contrast to the method of the present paper, that of Hestenes is indirect, 
and uses results which are in the nature of a Lindeberg condition in terms 
of the Weierstrass $E$-function as derived by the author [15]. It is to be
remarked that for the ordinary problem of Bolza the method of the present 
paper provides simplification of details in the expansion proofs previously 
given by the author in [12] and [13]. 


* Received February 26, 1949; presented to the American Mathematical Society,
September 4, 1947.
1 Numbers in square brackets refer to the bibliography at the end of this paper.



EXPANSION METHODS FOR THE PROBLEM OF BOLZA. 947 


The proof of Theorem 3.3, which is fundamental for the expansion 
proof of Section 5, is the goal of Part II of the present paper. This part 
has been written in such a manner, however, that to a large extent it may be 
read independently of Part I. In addition, other results have been included 
which are of importance in themselves. For example, Theorem 8.2 extends 
earlier results of Radon ([9], [10]) and the author [16] on the relation 
between the positiveness of the second variation of a Bolza problem and the 
existence of certain types of solutions of the associated Legendre matrix 
differential equation of Riccati type. The indicated proof of Theorem 8. 1 
shows that with the aid of a preliminary alteration it is possible to eliminate 
the condition of non-tangency used by Hestenes [3] and Bliss [2] in the 
proof of a corresponding result. Theorems 11.1 and 12.1 are useful in the 
treatment of certain types of self-adjoint boundary problems. 


Part I. Sufficient Condition for a Strong Relative Minimum. 


2. Formulation of the problem. The problem to be considered is that 
of minimizing a given functional 


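(2.1)  $J = g(x_1, y(x_1), x_2, y(x_2)) + \int_{x_1}^{x_2} f(x, y, y')\,dx$

in a class of arcs

(2.2)  $y_i(x)$,  $(i = 1, \dots, n;\ x_1 \le x \le x_2)$,

satisfying auxiliary first order differential equations

(2.3)  $\phi_\alpha(x, y, y') = 0$,  $(\alpha = 1, \dots, m)$,

the end-conditions

(2.4)  $\psi_\nu(x_1, y(x_1), x_2, y(x_2)) = 0$,  $(\nu = 1, \dots, p)$,

and the isoperimetric conditions

(2.5)  $J_s \equiv g_s(x_1, y(x_1), x_2, y(x_2)) + \int_{x_1}^{x_2} f_s(x, y, y')\,dx = 0$,  $(s = 1, \dots, q)$.

For brevity, this minimum problem will be referred to as "problem B."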


It is supposed that there is given an open region $\mathfrak{R}$ of points $(x, y, r)$
$= (x, y_1, \dots, y_n, r_1, \dots, r_n)$ in which the functions$^2$ $f(x,y,r)$, $\phi_\alpha(x,y,r)$,


2For the analysis of this paper it actually suffices to assume that the functions 
f, $a f, are of class C® in 9g. Under this weaker assumption, however, the accessory 




948 WILLIAM T. REID. 


fs(a,y,7) are of class C“), and the m Xn matrix of partial derivatives 
| dar,(2,y,7) || is of rank m. We suppose also that there is an open region 
D in the (2n + 2)-dimensional (2, yi:, 22, yiz)-space in which the functions 
J. Ww, gs are of class C), and, moreover, the matrix of partial derivatives 


Yin 


has rank p. A set (2, y,1), or (21, Yi1, %2, Yi2), Will be said to be “ admissible ” 
if it lies in @ or D, respectively. Corresponding to the terminology of Bliss 
[2; p. 194], an are (2.2) whose defining functions are continuous and have 
piecewise continuous derivatives will be termed admissible if its elements 
(x, y(r), Sx Sas, and (x, yi(%1), 2, y(@2)) are admissible. 

If we set 


  $F = \lambda_0 f(x, y, r) + \lambda_\alpha \phi_\alpha(x, y, r) + \lambda_s f_s(x, y, r)$,
where repetition of a subscript in a term denotes summation with respect to
this index on its specified range, an extremal $E$ is defined as an admissible
are of class C’), and a set of multipliers A» = constant, Ag(x) of class C™, 
As = constant, such that along £, 


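  $dF_{r_i}/dx - F_{y_i} = 0$,   $\phi_\alpha = 0$.
An extremal $E$: $y_i(x)$, $\lambda_0$, $\lambda_\alpha(x)$, $\lambda_s$ is said to satisfy the multiplier rule with
constants $e_\nu$ if the relation
  $[(F - y'_i F_{r_i})\,dx + F_{r_i}\,dy_i]_1^2 + \lambda_0\,dg + \lambda_s\,dg_s + e_\nu\,d\psi_\nu = 0$
holds for every choice of the differentials $dx_1$, $dy_{i1}$, $dx_2$, $dy_{i2}$. An extremal
is termed non-singular if the matrix

  $\begin{Vmatrix} F_{r_i r_j} & \phi_{\alpha r_i} \\ \phi_{\alpha r_j} & 0 \end{Vmatrix}$

is non-singular along this extremal.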


Throughout this paper we shall be concerned with an extremal having 
the leading multiplier Xr» different from zero, and, hence, without loss of 
generality, taken equal to unity. Such an extremal £ is said to satisfy the 
Weierstrass condition IIy if N is a (2n + m.+ q+ 1)-dimensional neighbor- 
hood of the elements (2, yi(r), y’i(),Aa(),As) of such that 


differential equations may not be expressed as linear equations of the second order, but 
may be written in canonical form. 


| 
| | 


EXPANSION METHODS FOR THE PROBLEM OF BOLZA. 


(2. 6) E(2, A;7) = F(z, A, A) — F(z, A, x) 
— (* — ri) y, 7, 20 


for arbitrary (v,y,r,A,A) of N and all *=(#;) such that (a,y,r) and 
(a, y, #) are distinct admissible sets which satisfy da(z, y, 7) = 0 = da(a, y, 7), 
(*—1,---,m). As usual, the condition obtained from JJy by deleting 
the equality sign in (2.6) is referred to as I/’y. Condition JZy and non- 
singularity imply the strengthened Clebsch condition, J/I’: at each element 
(x, y(x),y’(@),A(x),A) of # the quadratic form F,,,-,7izj is greater than 
zero for arbitrary sets (0;) satisfying = 0, =1,° m). 
Moreover, if / is non-singular and satisfies J7y, then it also satisfies I1’y if 
N is properly restricted (see [4]). 
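
The meaning of a Weierstrass condition of the type (2.6) can be seen in the simplest possible setting. The sketch below is an illustration under assumptions that go well beyond the text: a single dependent variable, no side conditions $\phi_\alpha$ and no isoperimetric terms, so that $F$ reduces to an integrand depending on the slope $r$ only. The function names and the two sample integrands are hypothetical choices.

```python
import math


def weierstrass_E(F, F_r, r, r_tilde):
    """E(r; r_tilde) = F(r_tilde) - F(r) - (r_tilde - r) * F_r(r).

    F is an integrand depending on the slope only, F_r its derivative.
    """
    return F(r_tilde) - F(r) - (r_tilde - r) * F_r(r)


def arc_length(r):
    """Arc-length integrand sqrt(1 + r^2), convex in r."""
    return math.sqrt(1.0 + r * r)


def arc_length_r(r):
    """Derivative of the arc-length integrand with respect to r."""
    return r / math.sqrt(1.0 + r * r)


def quadratic(r):
    """Integrand r^2, for which E is exactly (r_tilde - r)^2."""
    return r * r


def quadratic_r(r):
    """Derivative of r^2."""
    return 2.0 * r
```

For a convex integrand such as the arc-length integrand, $E$ is the gap between the graph and its tangent line, so it is non-negative for every pair $(r, \tilde r)$ and vanishes only for $\tilde r = r$; this is the unconstrained analogue of the strict inequality required in condition $II'_N$.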

In the terminology of Bliss [2;p.195], a pair of constants &, &2, 
together with an n-tuple (x) = (mi()) of functions which are continuous 
and have piecewise continuous derivatives on x, <x <2,, will be called an 
admissible variation. The equations of variations of (2.3), (2.4), (2.5) 
along a given admissible are C are, respectively, 


(2. 7) (2, 1’) pars j = 0, 
== div (E1,9(41) + (21), S25 + = 0, 


(2.9) 93 T2) Gs (E1, (21), &2, ) 


+ (fersn’s feyjnj) dx = 0, 
where in (2.9), 7 


= dgs (#1) + (21), + Ey’ (#2)) + feé > 

in each of the expressions (2.7), (2.8), (2.9), (2.10) the arguments of the 

partial derivative functions are those given by the elements (a, y(2), y’(2)) 

of @. As in Bliss [2], corresponding numerical subscripts denote the 

coefficients of the respective variables &, 7: (#1), &,i(@2) in W and Gs; 

specifically, 


(2.11) W(é, n(@1), &2,9(@2)) = + Wy, (%1) + + Wy, 529) (#2), 


with a similar notation for the coefficients of Gs, (s=1,°- -,q). 


For an extremal E: yi(x), Ao=1,Aa(Z),As, which 
satisfies the multiplier rule with constants ev, the second variation is written 




(2. 12) J2(E, ”) 27 (41, (21), £2, ) 


In this expression 
Rw (2, 7) = + 2F + 
where, as a function of (dx, dyi:, dyiz), 
2H dyis, dx2, dyiz) = [(Fe— y'iFy,) + 2Fy,dyidx]? + 26, 


and 2@ is the quadratic form in (da, dyi;, dr, dyi2) whose coefficients are 
the respective second order derivatives of g=4g + ewv+Acgs, while in all 
terms the partial derivatives of / and of g are evaluated along L. 

An extremal satisfying the multiplier rule will be said to satisfy con- 
dition IV also if along this extremal J.(é,7) 20 for arbitrary non-null 
admissible variations = (é1, which satisfy the corresponding 
equations of variation (2.7), (2.8) and (2.9). As usual, the condition 
obtained from the statement of IV upon replacing “Jo(é&) 20” by 
“Jo(é7) = 0” will be referred to as condition IV’. As is well-known, 
non-singularity and IV imply condition IIT’. 

If we set 


(2. 13) i) = o(2, 7) -+- PaPa (2, 7) + Bs (farm; fevni)s 


an accessory extremal is defined as a set of functions »;(#) of class C?, 
and a set of multipliers of class constant, such that 


(2. 14) (d/dr) Qn, (2, UB Bs — On, (2, UB = 0, (2, = 0. 


Along a non-singular extremal equations (2.14) may be written in terms of 
the canonical variables (2, mi, = Ox, (2, 7, @)) as 


(2.15) =D =— Dn, 


where § is the corresponding Hamiltonian function involving the parameters 


fis, (s=1,- i 
The sufficiency theorem to be proved in this paper is as follows. 


THEOREM 2.1. Suppose that $E$: $y_i(x)$, $\lambda_0 = 1$, $\lambda_\alpha(x)$, $\lambda_s$, $(x_1 \le x \le x_2)$,
is a non-singular extremal satisfying (2.4) and (2.5), and that $E$ satisfies
with constants $e_\nu$ the multiplier rule and conditions $II'_N$ and $IV'$. Then
there exists a neighborhood $\mathfrak{F}$ of $E$ in $xy$-space and a neighborhood $\mathfrak{D}$ of the
end-points of $E$ in $(x_1, y_{i1}, x_2, y_{i2})$-space such that $J(C) > J(E)$ for every
admissible arc $C$ satisfying (2.3), (2.4), (2.5) which lies in $\mathfrak{F}$, has end-points
in $\mathfrak{D}$, and is not identical with $E$.


3. Preliminary results. As shown by Reid [15; Sec. 3], if $E$ is an
extremal for the problem B satisfying III' then one may assume without loss
of generality that $E$ satisfies the following condition III*: Along $E$ the
quadratic form $F_{r_i r_j}\pi_i\pi_j$ is positive definite. More precisely, if


(2, r) f (2, r) + 1) ba(2, 


and B* denotes the minimum problem in which (2.3), (2.4), (2.5) are as
in problem B, but (2.1) is altered by substituting $f^*$ for $f$, then for a
suitable choice of the constant $l$ the corresponding quadratic form $F^*_{r_i r_j}\pi_i\pi_j$
for B* is positive definite. An arc (2.2) clearly affords a minimum for B
if and only if it affords a minimum for B*; also, $E$: $y_i(x)$, $\lambda_0$, $\lambda_s$ is an
extremal which satisfies with constants $e_\nu$ the multiplier rule for B if and
only if $E$ is an extremal for B* satisfying the corresponding multiplier rule
with the same constants $e_\nu$. In addition, for an extremal with $\lambda_0 = 1$ each of
the following conditions holds for problem B if and only if the condition
holds for problem B*: $II_N$, $II'_N$, non-singularity, III, III', IV, IV'.

The following theorem on the behavior of the Weierstrass $E$-function
involves the non-negative monotone non-decreasing convex function

(3.1)  $R(t) = t^2/(t + 1)$,  $(t \ge 0)$.

As noted in Reid [15; pp. 681, 683], this function has the following properties:

(3.2)
  (i)   $R(t) \le \mathrm{Min}(t, t^2) \le 2R(t)$,  $(t \ge 0)$;
  (ii)  $a\,\mathrm{Min}(1, a)\,R(t) \le R(at) \le a\,\mathrm{Max}(1, a)\,R(t)$,  $(a \ge 0,\ t \ge 0)$;
  (iii) $R(t_1 + t_2) \le 2[R(t_1) + R(t_2)]$,  $(t_1 \ge 0,\ t_2 \ge 0)$.
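
The elementary properties (3.2) can be spot-checked numerically. The following is a minimal sketch that assumes the reading $R(t) = t^2/(t+1)$ for (3.1), the bounds $R(t) \le \mathrm{Min}(t, t^2) \le 2R(t)$ for (i), $a\,\mathrm{Min}(1,a)R(t) \le R(at) \le a\,\mathrm{Max}(1,a)R(t)$ for (ii), and the constant 2 in $R(t_1 + t_2) \le 2[R(t_1) + R(t_2)]$ for (iii); the scanned source is damaged at this point, so these readings, like the helper names, are assumptions. A finite grid check is of course not a proof.

```python
def R(t):
    """Assumed form of (3.1): R(t) = t^2 / (t + 1) for t >= 0."""
    return t * t / (t + 1.0)


def leq(a, b, rel=1e-12):
    """a <= b up to a small relative tolerance (guards rounding at equality)."""
    return a <= b + rel * (1.0 + abs(a) + abs(b))


def check_properties(ts, a_values):
    """Check the assumed readings of (3.2)(i)-(iii) on a finite grid."""
    for t in ts:
        m = min(t, t * t)
        # (i): R(t) <= Min(t, t^2) <= 2 R(t)
        if not (leq(R(t), m) and leq(m, 2.0 * R(t))):
            return False
        # (ii): a Min(1,a) R(t) <= R(a t) <= a Max(1,a) R(t)
        for a in a_values:
            lower = a * min(1.0, a) * R(t)
            upper = a * max(1.0, a) * R(t)
            if not (leq(lower, R(a * t)) and leq(R(a * t), upper)):
                return False
        # (iii): R(t1 + t2) <= 2 [R(t1) + R(t2)]
        for t2 in ts:
            if not leq(R(t + t2), 2.0 * (R(t) + R(t2))):
                return False
    return True
```

Under the stated readings, (iii) follows from convexity together with (ii) for $a = 2$: $R(t_1 + t_2) \le \tfrac12[R(2t_1) + R(2t_2)] \le 2[R(t_1) + R(t_2)]$.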


From the Corollary to Theorem 3.2 and Theorem 3.3 of Reid [15] one 


obtains the following result.* 


TuHeoreM 3.1. Jf yi(%),Ao=1, Aa(X),As is an extremal for a 


8 Since for admissible sets (2, y,7), y,#) satisfying =0= ¢$, (a, 
the Weierstrass function for B is equal to the Weierstrass function for B*, while non- 
singularity and IJ, imply III’ as pointed out above, we have that inequalities (3.3), 
(3.4) contain a proof of the previously stated result that non-singularity and IIy 


imply II’, if N is properly restricted. 




problem B which satisfies conditions 111* and Ly, then there exists a neigh- 
‘borhood N of the elements of E in (x, y,7r,A,)-space, and positive constants 
and xo such that if (a, y,7,A,A) is in N then (x,y, 7) is admissible, and if 
(Fi) ts any set such that (2,y,7) is admissible and y,*) =0, 
(a==1,---,m), then* 


(3. 3) E(x, y, 7,A, 437) = F—r |), 
(3. 4) | €s(a, y, 7; F)| S (a, y, 7, A, 237), (s==1,---,q), 


where =fe(2x, y, —fs(2, y, 7) — (Fi — far,(2,y, 7), the 
ordinary Weierstrass function for fs(x, y, 1). 

Using the above inequalities (3.2), one may establish the following 
integral inequality which will be employed in the proof of the sufficiency 
theorem. The proof may be obtained by the same type of argument as given 
for Theorem 5.1 of Reid [12], (see also Lemma 4.1 of Reid [15]).° 


THEOREM 3.2. Jf h(x), (7 absolutely continuous 
functions on X,; Sx=Xz, and || h(x)|| [8 on this interval, then 


where d, = 2 Max(1, 28) Max(1, X,), = (2d, + 1){X2.—i). 

The following result for the second variation, which is fundamental for 
the expansion proof of Theorem 2.1, is a direct consequence of Theorem 12. 2. 
For an indication of the relation of this theorem to previous results of 
Hestenes and Bliss the reader is referred to Section 7 and the remark 
following the statement of Theorem 8. 1. 


THEOREM 3.3. If E: yi(@),Ao=1, As, (V1 SX SM), is a non- 
singular extremal for problem B which satisfies with constants ey the multi- 
plier rule and condition IV’, then there are constants > 0, ¢=0, x, > 0 
such that on 2,—8& SaSa2.+8 there exists a non-singular matrix 


‘If for an arbitrary integer k we have defined a k-tuple of real numbers 


u= (u -,W,,) the symbol || wu || is used to denote the non-negative square root of 


1’ 
u*i+----+u°*,. In particular, this notation is extended to functionals (2.7), (2. 8), 
(2.9); for example, in (3.9), || || is the non-negative square root of 


5 It is to be remarked that the statement of Lemma 4.1 of Reid [15] contains two 
errors: one in the omission of brackets in formulas (4.1) and (4.2); the other in the 
stated value of the constant d,. 




U(x) = || Uor(x) ||, 7 = +, 2n + q), with elements of class C’, and 
continuous functions War(x), q), satis- 
fying the differential equations 


(3. 7) = 0, Ul + fer,U’ jr fey U = 0, 
(4—1,- ‘,n3s—1,: 


and with the property that if the components of n= (mi(@)) are absolutely 
continuous on X;S where | X,—-2,| <8, | <8, and 


h(x) = (h,(x)) is defined by 


(3. 8) 
= J (feryn's + aa, 
then 
2y 9(X1), 9(X2)) + Xe) 
(3. 9) = wi + £2? + | h(Xi) + |] ?) 
— ¢(|| |] + |] |] 7), 
where 


(3.10) hr (2), = 251) == Bar (2), 
(3.11) Xi, X2) 


Xo 
=2 f 0) + (9 ) Ox, (2, n, 7, 0) }dz. 


4, Expansion of the increment of J(C). Suppose that yi(x), A. = 1, 
S is a non-singular extremal for problem B satisfying 
(2.4), (2.5), as well as the multiplier rule with constants ey. Since FH is 
non-singular, by the existence theorem for differential equations there exists a 
8 >0 such that an extension of EF is defined and non-singular on the 
interval 7; —¥ 8. If 0<8< we shall denote by the 
neighborhood of in (2, y)-space consisting of all sets (a,y) satisfying 
ly—y(a)| <8, 7, —8<a4<a,48. The portion of % in which either 
<x <2x,+8 defines a neighborhood of the 
end-points of E in (21, Yi1, C2, Yiz)-space which will be denoted by Ds. For 
the neighborhoods 5 and Ds used in the following, it is understood that 
they are defined as above, withO <8<&. 

For the set of continuous arcs $C$: $Y_i(x)$, $(X_1 \le x \le X_2)$, where




954 WILLIAM T. REID. 


$x_1 - \delta < X_1 < X_2 < x_2 + \delta$, the symbol $|C - E|$ will denote the “distance
between $C$ and $E$” defined by


(4.1)    $|C - E| = |X_1 - x_1| + |X_2 - x_2| + \max_{X_1 \le x \le X_2} |Y(x) - y(x)|$.
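As a rough numerical illustration of the distance (4.1), the following sketch approximates $|C - E|$ on a uniform grid. It is not part of the original paper: the function name `arc_distance`, the grid size, and the use of the Euclidean norm for $|Y(x) - y(x)|$ are assumptions made for the example, and the extremal $y$ is assumed to be defined (after extension) on all of $X_1 \le x \le X_2$.

```python
import math

def arc_distance(X1, X2, Y, x1, x2, y, samples=2001):
    """Approximate |C - E| from (4.1) on a uniform grid.

    Y and y are callables returning the n components of the arc C and of the
    (extended) extremal E at a point x. The maximum over [X1, X2] is replaced
    by a maximum over grid points, so the value is a lower estimate of the
    true distance.
    """
    if samples < 2:
        raise ValueError("need at least two sample points")
    grid = [X1 + (X2 - X1) * j / (samples - 1) for j in range(samples)]
    # Euclidean norm of Y(x) - y(x) at each grid point (an assumption of this sketch).
    sup_gap = max(math.dist(Y(x), y(x)) for x in grid)
    return abs(X1 - x1) + abs(X2 - x2) + sup_gap
```

For instance, with `y = lambda x: (x, 0.0)` on $0 \le x \le 1$ and the shifted arc `Y = lambda x: (x + 0.1, 0.0)` on $0.05 \le x \le 0.95$, the three terms of (4.1) are 0.05, 0.05 and 0.1.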


Suppose that for such arcs $C$ there are defined two functionals $f_1(x;C)$ and
$f_2(x;C)$ on $X_1 \le x \le X_2$. We write $f_1(x;C) = o\{f_2(x;C)\}$ if for each
$\epsilon > 0$ there exists a $\delta_\epsilon > 0$ such that $|f_1(x;C)| \le \epsilon |f_2(x;C)|$, $(X_1 \le x \le X_2)$,
for all curves $C$: $Y_i(x)$, $X_1 \le x \le X_2$, with $|C - E| < \delta_\epsilon$. Correspondingly,
the condition that there exist positive constants $M$, $\delta$ such that $|f_1(x;C)|
\le M |f_2(x;C)|$, for all arcs $C$ with $|C - E| < \delta$ is written
$f_1(x;C) = O\{f_2(x;C)\}$. The above conditions for the particular functional
$f_2(x;C) \equiv 1$ are indicated by $f_1(x;C) = o\{1\}$ and $f_1(x;C) = O\{1\}$,
respectively.
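As a small illustration of this notation (an example supplied here, not one occurring in the paper), take two functionals that are constant in $x$ and depend on $C$ only through the distance (4.1). Given $\epsilon > 0$, the choice $\delta_\epsilon = \epsilon$ suffices:

```latex
% Illustrative example only; f_1 and f_2 are hypothetical functionals.
f_1(x;C) = |C-E|^2, \qquad f_2(x;C) = |C-E| ,
\\
|C-E| < \delta_\epsilon = \epsilon
\;\Longrightarrow\;
|f_1(x;C)| = |C-E|\,|C-E| \le \epsilon\,|f_2(x;C)| ,
\qquad\text{so } f_1(x;C) = o\{f_2(x;C)\} .
```

The same choice of $\delta_\epsilon$ shows that $f_2(x;C) = o\{1\}$ in this example.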

For the general expansion of $\Delta J = J(C) - J(E)$ given below it is
supposed that for each admissible arc $C$: $Y_i(x)$, $(X_1 \le x \le X_2)$, with
$x_1 - \delta < X_1 < X_2 < x_2 + \delta$, there exist “slope-functions” $p = (p_i(x;C))$,
$(i = 1, \dots, n)$, and “multipliers” $\Lambda = (\Lambda_\alpha(x;C))$,
which as functions of $x$ are continuous on $X_1 \le x \le X_2$, reduce to $y'_i(x)$ and
$\lambda_\alpha(x)$, respectively, if $C = E$, and as functionals of $C$ are continuous with
respect to the metric (4.1) in the sense that


$\|p(x;C) - y'(x)\| = o\{1\}$, $\|\Lambda(x;C) - \lambda(x)\| = o\{1\}$.


In the proof of the sufficiency theorem as completed in the following section
explicit values of $p_i(x;C)$ and $\Lambda_\alpha(x;C)$ will be specified. For brevity, the
notation


$\Delta g = g(X_1, Y(X_1), X_2, Y(X_2)) - g(x_1, y(x_1), x_2, y(x_2))$


is introduced, with the similar meanings for Ags and Ag where, as in 
Section 2, g = g + evv + Acgs, and ev, As belong to the set with which F 
satisfies the multiplier rule. 

Now if C: Yi(x), —¥ << X.<2,+ is an admissible 
are satisfying (2.3), (2.4), (2.5), 


Xe v2 
AJ = Ag+ (2, ¥, f f(x, y, y’) da 
Xy v1 
= Ag + J°—J*— J’, 


(4. 2) 


where 


Xe 
Xi 




(4. 4) Jt == (— Be, y; y’, A, A) da, = 1,2). 
Xx 
Writing =ni(7;C) = Yi(v) — yi(@), =pa(t;C) = 


—Ag(z), (X11 Sx=X,), and using the expansion method of Section 6 of 
Reid [12], it follows that 


Xe X» 
f E(x, Y, p, A, A; Y’)dx + f + da 
Xy 
, 
(4. 5) + (Q(z, 7 P— Ys 0) — pi)Ox,(2, p—Y’, u,0)}da 


Xo 
{a(z;C) + (Y's — pi) bi C) }dz, 


where for a given $C$ the functions $a(x;C)$ and $b_i(x;C)$ are continuous in $x$
on $X_1 \le x \le X_2$, while as functionals of $C$,
(4. 6)    a(x; C) = (x) + | p(z) 1? + 73,


It is to be understood that the arguments of the partial derivatives of $F$
occurring in the second and third integrals of (4.5) are the elements of the
extremal $E$.

In view of the Euler-Lagrange equations the second integral in (4.5)
is equal to — Moreover, by the expansion 
process of Section 6 of Reid [12], it follows that 


(4.7) Ag—J*— J? + Fr; | 


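where for each $\epsilon > 0$ there is a $\delta_\epsilon > 0$ such that if the end-points of $C$ are
in the $\mathfrak{D}_{\delta_\epsilon}$ neighborhood of the ends of $E$, then $|\gamma^*(C)| \le \epsilon\{(X_1 - x_1)^2
+ (X_2 - x_2)^2 + \|\eta(X_1)\|^2 + \|\eta(X_2)\|^2\}$. In particular,


(4.8)    $\gamma^*(C) = o\{(X_1 - x_1)^2 + (X_2 - x_2)^2 + \|\eta(X_1)\|^2 + \|\eta(X_2)\|^2\}$.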


Combining (4.5) and (4.7), it results that if C: Y;(z),7,—8 < X; 
<7 X,<27,.+¥%, is an admissible are satisfying (2.3), (2.4), (2.5), 
then 


(4.9) AJ = E(a,Y,p,A,A; Y’)dx + 


Xo 
+ f + C) + 
xX, 




where a(x;C), b(z;C) and y*(C) are “remainder” expressions possessing 
the order properties (4.6) and (4.8), respectively, and 


A 


J = 2y(L1 — 21, (X1), X2 — £2, (X2) ) 


(4. 10) X, 
+ 2 foc, 7, —y; 0) + — pi) Qn, (2, 7, Bs 0) 


As in the earlier expansion proofs of the author ([12], [13]), it will be
proved that for suitable choices of $p_i(x;C)$ and $\Lambda_\alpha(x;C)$ the first two
terms of (4.9) are the dominant ones in this expression.


5. Proof of Theorem 2.1. In view of the comments at the beginning
of Section 3 we may assume that the extremal $E$ satisfies condition III′ in
addition to the stated hypotheses of Theorem 2.1, and such assumption will
be made in this and the following section. Suppose that $\delta$ is positive and
not greater than either the $\delta_0$ of Theorem 3.3 or the corresponding constant
defined in the first paragraph of Section 4, and consider an admissible arc $C$: $Y_i(x)$,
$(X_1 \le x \le X_2)$, for problem B which satisfies (2.3), (2.4), (2.5), and
has end-points in the $\mathfrak{D}_\delta$ neighborhood of the end-points of $E$. Let
$\eta_i(x) = Y_i(x) - y_i(x)$, $(X_1 \le x \le X_2)$, and consider the functions $h(x)
= (h_r(x))$, $(r = 1, \dots, 2n + q)$, defined by equation (3.8) of Theorem
3.3. For an admissible arc $C$ these functions are continuous and have
piecewise continuous derivatives; indeed, if the functions $Y_i(x)$ defining $C$ are
merely absolutely continuous then the functions $h_r(x)$ are absolutely
continuous on $X_1 \le x \le X_2$. Clearly $\|\eta(x)\| = O\{\|h(x)\|\}$; moreover, the
$(2n + q)$-dimensional vector


Xe 
(2) = (Xe), + da) 
satisfies the folowing relations: || h(x) ||—O{|| a(x)||}, and || 4(z)| 
= O{|| h(x)||}. Also, since upon integration by parts, 


Nonss(L) fern; | + (fev, (d/dt) fers) nj dt, 
it follows that 


(x) II}; 


in particular, as functionals of C, || a(x) || —of{1}, and || h(x) || = of{1}. 
Now define functions p;(z;C’) and Ag(x;C) as 

pi(x; 0) =i (x) + (x)h, (x) = + (250), 

Aa (2; C) = da(t) + (x) =Aa(z) + 0), 


(5. 1) || a(x) || — O{Maxx,<e<x, 


(5. 2) 




in the notation of (3.10). These functions satisfy the conditions prescribed
in Section 4, since for each admissible arc $C$ with end-points in $\mathfrak{D}_\delta$ they are
continuous on $X_1 \le x \le X_2$, while as functionals of $C$,


(5.3) || p(a3C) —y’(x) || = 
| A(x; C) —A(z) || = h(z) |}. 


From the definition of p;(z;C) in (5.2), and the first n equations of 
(3.8), it follows that U;,;h’, = p;, and hence 


(5. 4) | ¥’(x) — p(x; C) || = 

Moreover, in view of (3.7), we have that = 0, 9), 
= — fer,(¥’; — Pj), (8 and consequently 

(5. 5) || h’(x) || = — C) 


Since both C and EF satisfy (2.4), upon expanding the functions of these 
conditions to first order terms it follows that 


| — 21, 7(X,), X2— 22, n( X2) ) |] 
—o{|Xi—m | + + I}. 


In view of (5.1), it follows immediately that 


Moreover, since both ( and £ satisfy (2.5), upon expanding AJ, —J,;(C) 
—J;(E) in the manner analogous to the expansion of AJ in Section 4, and 
using the above stated order relations for || ||, || p—y’ || and || Y’—p], 
it results that 


Xe 
X.(X1 — X2 — V2, ; X1, Y,p; Y’)dr + X*,(C), 
xX, 
with 


Xe 
+ si h(x) ||? + || h(z)|| || h’(x) ||) dz}. 


Consequently, by Theorem 3.3 the functional J, of (4.9) satisfies the 
inequality 




Jy wil (Xi — 21)? + (X2— ae)? + | + | (Le) | 7] 


Xe 


(5.6) 


Now for |C—F| sufficiently small the arguments (2, Y(z), p(x;C), 
A(xz;C),A) of the integrand of the first integral in (4.9) lie in the 
neighborhood 9 of Theorem 3.1. Hence, in view of Theorem 3.1, together 
with (5.5) and (3.2ii), there are positive constants «’, 8” such that if 
C: Yi(x), (¥: S24 is any admissible are for which |C—E| < 8, 
then 


(5.7) J E(x, Y,p, AAs =e’ f h’(t) 


From the expression (4.9) for $\Delta J$, together with (4.6), (4.8), (5.6),
(5.7), the order relations derived above, inequality (3.4) of Theorem 3.1,
and Theorem 3.2, it follows that if $0 < \kappa'' < \kappa'$, $0 < \kappa'_1 < \kappa_1/2$, then there
is a $\delta > 0$ such that if $C$: $Y_i(x)$, $(X_1 \le x \le X_2)$, is an admissible arc
satisfying (2.3), (2.4), (2.5), and for which $|C - E| < \delta$, then


(5.8) =x” | (| h’(t) ||) dt 
+ [(X1— 21)? + (X2— 22)? + |] h(X1) + | 


Consequently, for such an admissible arc $C$ we have $J(C) \ge J(E)$, and the
equality holds if and only if $X_i = x_i$, $\|h(X_i)\| = 0$, $(i = 1, 2)$, and the
integral in (5.8) is equal to zero, in which case $h_\sigma(x) \equiv 0$ and $C = E$. In
particular, the conclusion of Theorem 2.1 holds for $\mathfrak{F} = \mathfrak{F}_\delta$ and $\mathfrak{D} = \mathfrak{D}_\delta$,
with $\delta$ suitably small. It is to be remarked that this result remains valid
for arcs $C$: $Y_i(x)$, $(X_1 \le x \le X_2)$, lying in $\mathfrak{F}$ and having end-points in $\mathfrak{D}$
which satisfy (2.4), while $Y_i(x)$, $(i = 1, \dots, n)$, are absolutely continuous
functions which satisfy Y, Y’) almost everywhere on Se 
are such that the integrals in (2.1) and (2.5) exist in the Lebesgue sense, 
and which satisfy (2.5). 


6. An Osgood theorem. In view of the preceding expansion proof of
Theorem 2.1, the following Osgood theorem for problem B may be established
directly, and without recourse to an associated ordinary problem of
Bolza, as was done in Section 6 of Reid [15].


THEOREM 6.1. Suppose that $E$ is an extremal for problem B which
satisfies the hypotheses of Theorem 2.1. Then there exists a neighborhood
$\mathfrak{F}$ of $E$ in $xy$-space and a neighborhood $\mathfrak{D}$ of the end-points of $E$ in
$(x_1, y_{i1}, x_2, y_{i2})$-space with the following “Osgood property”: corresponding
to each neighborhood $\mathfrak{F}'$ of $E$ interior to $\mathfrak{F}$, and each neighborhood $\mathfrak{D}'$ of the
end-points of $E$ interior to $\mathfrak{D}$, there exists a constant $k > 0$ such that for
every admissible arc $C$ satisfying (2.3), (2.4), (2.5) which lies in $\mathfrak{F}$ and
has end-points in $\mathfrak{D}$, but which does not lie in $\mathfrak{F}'$ and have end-points in $\mathfrak{D}'$,
the inequality $J(C) - J(E) \ge k$ is valid.


Let § = 3s, D = Ds, where § is such that inequality (5.8) holds for 
arbitrary admissible arcs C satisfying (2.3), (2.4), (2.5), and which lie in 
@s and have end-points in D3. For given 3’ and ©’ interior to § and 9, 
respectively, there exists a ry 0<8< 6, such that % is interior to 3’ and 
3 is interior to D’. Consequently, if C is in & with end-points in 9, but 
does not lie in %’ and have end-points in ©’, then either (X,—2,)? 
+ = 8, or there is an on X¥,S2x<X, such that || y(2)| 
= || Y(t) — y(2) | >%. In the first case, it follows from (5.8) that 
J(C) —J(E£) = «’,8. Since || || = O{|| h(x) ||}, there is a constant 
M,> 0 such that | S || h(z)||, (X1 for an arbitrary 
admissible are C: Yi(x), (X, lying in in particular, if 
| | then || h(2)|| =3/My. Now, as shown in Section 6 of Reid 
[15], from (3. 2ii), (3. 2ii1), and Jensen’s inequality, it follows that for an 
arbitrary point on X,; [x= we have 


| || dt = doR(|| h(x) II), 


where dy is a positive constant such that Min (1,1/(X.— X,)) = 2d, when 
| ay, | <8, =1,2). In particular, if there is an on X, 
such that || y(ao)|| = 8, from (5.8) it results that 


J(C) —J(EB) = dy Min (”, «’,)R(5/Mo). 


Consequently, if C is an admissible arc satisfying (2.3), (2.4), (2.5) 
which lies in $ and has end-points in D, but which does not lie in 8 and 
have end-points in 9’, then J(C) —J(E) =&, where « is the smaller of the 
constants «’,5? and dy Min («”, «’,)R#(8/M,). 




Part II. The Accessory Minimum Problem. 


7. Prefatory remarks. In this part we shall consider a minimum 
problem which is essentially the accessory problem associated with an ordinary 
problem of Bolza or an isoperimetric problem of Bolza. In particular, we 
shall establish in Theorem 12.2 a result that implies Theorem 3.3, which 
was fundamental for the expansion proof of Theorem 2.1 given in Section 5. 
Another established result, that of Theorem 12.1, is of use in the consideration
of boundary value problems. There is considered first a problem involving
separated end-conditions and no isoperimetric conditions; the results for the
more general problem are obtained by reducing such a problem to one of the
type initially considered, using a well-known transformation. Theorem 8.1
below is equivalent to one initially proved by Hestenes [3]; the method of
proof here given parallels that of Bliss [2; Secs. 86, 87], after a preliminary
modification that enables one to avoid the assumptions of normality and 
non-tangency made by him. 

Theorem 8.2 generalizes the results of Sections 3 and 4 of Reid [16]. 
As pointed out by the author [17], priority is due to J. Radon for the 
presentation of the Legendre equation for problems of Lagrange type as a 
matrix differential equation of Riccati type, and the fundamental theorems 
on the integration of such equations. In particular, the results of Sections 3 
and 4 of Reid [16] are contained in the papers [9] and [10] of Radon. 

Throughout this part matrix notation is used extensively: The transpose
of a matrix $M$ is denoted by $\tilde M$; in particular, vectors $y$, $\eta$, etc. are treated
as one-column matrices.


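8. Formulation of the problem not involving isoperimetric conditions.
Let $(\xi, \eta) = (\xi_\rho, \eta_i)$, $(\rho = 1, \dots, k;\ i = 1, \dots, n)$, and
consider the quadratic functional


(8.1)    $J_2(\xi, \eta) = 2\gamma(\xi, \eta(x_1), \eta(x_2)) + \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\,dx$,


in which $2\gamma(\xi, \eta_1, \eta_2) = 2\gamma((\xi_\rho), (\eta_{i1}), (\eta_{i2}))$ is a quadratic form in the $2n + k$
variables $\xi_\rho$, $\eta_{i1}$, $\eta_{i2}$, and $2\omega(x, \eta, \pi)$ is a quadratic form in the $2n$ variables
$\eta_i$, $\pi_i$ with coefficients which are functions of $x$. The symbol $B_2$ will be used
to designate the variational problem involving (8.1) subject to auxiliary
first order linear homogeneous differential equations


(8.2)    $\Phi_\alpha(x, \eta, \eta') \equiv \varphi_{\alpha j}(x)\eta'_j + \theta_{\alpha j}(x)\eta_j = 0$, $(\alpha = 1, \dots, m)$,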




and a set of linear homogeneous end-conditions


(8.3)    $\Psi_\mu(\xi, \eta(x_1), \eta(x_2)) \equiv \Psi_{\mu,\rho}\xi_\rho + \Psi_{\mu,i1}\eta_i(x_1) + \Psi_{\mu,i2}\eta_i(x_2) = 0$.


It is to be understood that the coefficients of the forms $\gamma$ and $\Psi_\mu$ are real
constants; corresponding to the notation in (8.3), the partial derivatives of
$\gamma$ with respect to the arguments $\xi_\rho$, $\eta_{i1}$, $\eta_{i2}$ are denoted by $\gamma_\rho$, $\gamma_{i1}$, $\gamma_{i2}$,
respectively. We write


$2\omega(x, \eta, \pi) = \tilde\pi R(x)\pi + 2\tilde\pi Q(x)\eta + \tilde\eta P(x)\eta$,


where $R(x)$, $Q(x)$, $P(x)$ are $n \times n$ matrices with $R(x)$ and $P(x)$ symmetric.
It is supposed that the elements of the matrices $R(x)$, $Q(x)$, $P(x)$, $\varphi(x)
= \|\varphi_{\alpha i}(x)\|$, $\theta(x) = \|\theta_{\alpha i}(x)\|$ are real-valued continuous functions of
$x$ on an interval containing in its interior the interval $x_1 \le x \le x_2$, while $\varphi(x)$
is of rank $m$ on this interval.

A vector $\eta(x) = (\eta_i(x))$, $(i = 1, \dots, n)$, will be termed admissible if
the functions $\eta_i(x)$ are continuous and have piecewise
continuous derivatives on $x_1 \le x \le x_2$. If $\eta(x)$ is admissible and $\xi = (\xi_\rho)$,
$(\rho = 1, \dots, k)$, are arbitrary constants, the set $(\xi, \eta(x))$ will be called
admissible; such a set will be termed null if $\xi_\rho = 0$, $(\rho = 1, \dots, k)$, and
$\eta_i(x) \equiv 0$, $(x_1 \le x \le x_2;\ i = 1, \dots, n)$. The accessory problem for an
ordinary problem of Bolza is a special case of $B_2$ in which $\xi = (\xi_1, \xi_2)$.

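If $\Omega(x, \eta, \pi, \mu) = \omega(x, \eta, \pi) + \mu_\alpha \Phi_\alpha(x, \eta, \pi)$, the Euler-Lagrange
equations of $B_2$ are


(8.4)    $(d/dx)\,\Omega_{\pi_i}(x, \eta(x), \eta'(x), \mu(x)) - \Omega_{\eta_i}(x, \eta(x), \eta'(x), \mu(x)) = 0$,
         $\Phi_\alpha(x, \eta(x), \eta'(x)) = 0$.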


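If $B_2$ is non-singular, that is, if the $(n + m)$-order square matrix


(8.5)    $\begin{Vmatrix} R(x) & \tilde\varphi(x) \\ \varphi(x) & 0 \end{Vmatrix}$


is non-singular on $x_1 \le x \le x_2$, then in terms of the canonical variables
$\eta_i(x)$, $\zeta_i(x) = \Omega_{\pi_i}(x, \eta(x), \eta'(x), \mu(x))$ the system (8.4) may be written


(8.6)    $\eta' = A(x)\eta + B(x)\zeta$, $\zeta' = C(x)\eta - \tilde A(x)\zeta$.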


For the explicit forms of $A(x)$, $B(x)$, $C(x)$, see, for example, p. 245 of [16];
in particular, the elements of these matrices are continuous and $B(x)$ and
$C(x)$ are symmetric. If $\eta$, $\zeta$ and $\eta^*$, $\zeta^*$ are two solutions of (8.6), then
$\tilde\eta\zeta^* - \tilde\zeta\eta^* = $ constant; when this constant is zero these solutions are said
to be conjugate to each other. As shown in Radon [10], or Reid [16], for
a non-singular problem $B_2$ there exists a set of $n$ solutions $\eta_i = U_{ij}(x)$,
$\zeta_i = V_{ij}(x)$, $(j = 1, \dots, n)$, of (8.6) with $U(x) = \|U_{ij}(x)\|$ non-singular
on $x_1 \le x \le x_2$, if and only if there is on this interval a continuous solution
$W(x)$ of the associated “Legendre matrix differential equation”


(8.7)    $W' + WA(x) + \tilde A(x)W + WB(x)W - C(x) = 0$.


Indeed, if $\eta_i = U_{ij}(x)$, $\zeta_i = V_{ij}(x)$ are $n$ solutions of (8.6) with $U(x)$
non-singular on $x_1 \le x \le x_2$ the matrix $W(x) = V(x)U^{-1}(x)$ is a continuous
solution of (8.7); moreover, these solutions of (8.6) are mutually conjugate
if and only if $W(x) = V(x)U^{-1}(x)$ is symmetric.
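Both assertions can be checked by a short direct computation, sketched here under the assumption that the canonical system (8.6) reads, columnwise, $U' = AU + BV$, $V' = CU - \tilde A V$ (the form suggested by (12.10)); the computation is supplied for orientation and is not reproduced from the paper.

```latex
% Assumption: (8.6) in matrix form is U' = A U + B V,  V' = C U - \tilde{A} V.
W' = V'U^{-1} - V U^{-1} U' U^{-1}
   = (C U - \tilde{A} V)\,U^{-1} - W\,(A U + B V)\,U^{-1}
   = C - \tilde{A} W - W A - W B W ,
\\
% Symmetry of W versus mutual conjugacy of the columns of U, V:
\tilde{U}\,(W - \tilde{W})\,U = \tilde{U} V - \tilde{V} U .
```

The first line is equation (8.7). In the second line the $(j,k)$ element of $\tilde U V - \tilde V U$ is the constant $\tilde\eta\zeta^* - \tilde\zeta\eta^*$ formed from the $j$-th and $k$-th column solutions, so that, $U$ being non-singular, $W$ is symmetric exactly when all these constants vanish.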

For $B_2$ condition III is that on $x_1 \le x \le x_2$ the quadratic form $\tilde\pi R(x)\pi$
be non-negative for arbitrary $\pi$ satisfying $\varphi(x)\pi = 0$; correspondingly,
III′ is the condition that $\tilde\pi R(x)\pi > 0$ for arbitrary $\pi \ne (0)$ satisfying
$\varphi(x)\pi = 0$, $(x_1 \le x \le x_2)$. From Radon [10], or Reid [16], it follows that
if B, satisfies condition IIT’ then J2(é,7) > 0 for arbitrary non-null admissible 
sets satisfying = 0, ni(71) = 0 if and only if there is 
a continuous symmetric solution of (8.7) on Sa. 
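The relation between the canonical system (8.6) and the Legendre equation (8.7) can be checked numerically in the simplest case. The sketch below is an illustration added here, not a construction from the paper: it assumes $n = 1$, constant coefficients $A$, $B$, $C$, and the form $\eta' = A\eta + B\zeta$, $\zeta' = C\eta - A\zeta$ of (8.6); the helper names `rk4_step` and `riccati_residuals` are hypothetical. It integrates one solution $(u, v)$ with $u(x_1) = 1$, $v(x_1) = 0$, forms $W = v/u$, and evaluates the residual of the scalar form $W' + 2AW + BW^2 - C = 0$ of (8.7) by central differences.

```python
def rk4_step(f, x, y, h):
    """One classical Runge-Kutta step for y' = f(x, y), with y a list of floats."""
    k1 = f(x, y)
    k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def riccati_residuals(A, B, C, x1=0.0, x2=1.0, steps=1000):
    """Residuals of W' + 2*A*W + B*W**2 - C at interior grid points, W = v/u.

    (u, v) solves the scalar canonical system u' = A*u + B*v, v' = C*u - A*v
    with u(x1) = 1, v(x1) = 0; u is assumed to stay non-zero on [x1, x2].
    """
    h = (x2 - x1) / steps

    def canonical(x, y):
        u, v = y
        return [A * u + B * v, C * u - A * v]

    y, x = [1.0, 0.0], x1
    W = [y[1] / y[0]]
    for _ in range(steps):
        y = rk4_step(canonical, x, y, h)
        x += h
        W.append(y[1] / y[0])
    # Central differences approximate W' at the interior grid points.
    return [(W[j + 1] - W[j - 1]) / (2 * h) + 2 * A * W[j] + B * W[j] ** 2 - C
            for j in range(1, steps)]
```

With, say, `A = 0.5`, `B = 1.0`, `C = 2.0` the function $u$ is a positive combination of $e^{1.5x}$ and $e^{-1.5x}$, so $W$ is defined on the whole interval and the residuals are small, limited only by the discretization error.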

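Corresponding to the terminology used in more general variational
problems, $B_2$ is said to involve separated end-conditions if upon suitable
renumbering of the $\xi$'s there is a $k'$, $(0 \le k' \le k)$, such that if we set
$\xi^1 = (\xi_{\rho'})$, $(\rho' = 1, \dots, k')$, and $\xi^2 = (\xi_{\rho''})$, $(\rho'' = k' + 1, \dots, k)$, then
$2\gamma$ is of the form $2\gamma^1(\xi^1, \eta(x_1)) + 2\gamma^2(\xi^2, \eta(x_2))$, while the end-conditions
(8.3) may be written after possible re-ordering, as two systems of the form


(8.3″)    $\Psi^1_{\mu'}(\xi^1, \eta(x_1)) = 0$, $\Psi^2_{\mu''}(\xi^2, \eta(x_2)) = 0$.


It is to be remarked that either of the sets $\xi^1$ or $\xi^2$ may be non-existent,
corresponding to $k' = 0$ and $k' = k$, respectively.
For the general problem $B_2$ we introduce the notation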
For the general problem B, we introduce the notation 


(8.8) (a1), 3 ¢) = 2y(E, n(21), + ¢ |] n(21), 


with corresponding definitions of and 2y?(é, (22) ;c¢) for 
a problem with separated end-conditions. In particular, we set 


J2(é,93¢) =I2(E, 7) + VCE, *. 




that is, the expression (8.1) with 2y(é, (21), 7(#2)) replaced by 2y(é(21), 
n(@2)3¢). 

For the results of this part the following theorem is central. 

THEOREM 8.1. If problem $B_2$ involves separated end-conditions and
satisfies III′, then a necessary and sufficient condition for $J_2(\xi, \eta)$ to be
positive for arbitrary non-null admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3)
is that there exist a conjugate system of $n$ solutions $\eta_i = U_{ij}(x)$, $\zeta_i = V_{ij}(x)$
of (8.6) with $U(x) = \|U_{ij}(x)\|$ non-singular on $x_1 \le x \le x_2$, and which
satisfies with a suitable constant $c \ge 0$ the inequalities


(8.9)    $2M^1(\xi^1, a; c; x_1) \equiv 2\gamma^1(\xi^1, U(x_1)a; c) - \tilde a\,\tilde U(x_1)V(x_1)a > 0$,


(8.10)   $2M^2(\xi^2, b; c; x_2) \equiv 2\gamma^2(\xi^2, U(x_2)b; c) + \tilde b\,\tilde U(x_2)V(x_2)b > 0$,
for respective arbitrary non-null sets $(\xi^1, a) = (\xi_{\rho'}, a_i)$, $(\xi^2, b) = (\xi_{\rho''}, b_i)$.


In view of an elementary theorem on pairs of quadratic forms (see, for 
example, Reid [14]), inequalities (8.9) and (8.10) are equivalent to 
corresponding conditional inequalities used by Hestenes [3] and Bliss [2; 
p. 247]; as pointed out in Section 7, however, the proofs of these corresponding
inequalities by Hestenes and Bliss involve an additional hypothesis of
non-tangency. From the remark following equation (8.7) it follows that under
the substitution $a^1 = U(x_1)a$, $a^2 = U(x_2)b$ the result of the above theorem
is expressed in terms of solutions of the Legendre matrix differential equation
(8.7) as follows.


THEOREM 8.2. Under the hypotheses of Theorem 8.1, a necessary and
sufficient condition for $J_2(\xi, \eta)$ to be positive for arbitrary non-null admissible
$(\xi, \eta(x))$ satisfying (8.2), (8.3) is that there exist on $x_1 \le x \le x_2$ a
continuous symmetric solution $W(x)$ of (8.7) which satisfies with a suitable
constant $c \ge 0$ the inequalities


(8.11)    $2\gamma^1(\xi^1, a^1; c) - \tilde a^1 W(x_1)a^1 > 0$,
(8.12)    $2\gamma^2(\xi^2, a^2; c) + \tilde a^2 W(x_2)a^2 > 0$,


for respective arbitrary non-null sets $(\xi^1, a^1)$, $(\xi^2, a^2)$.
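The passage from the quadratic terms of Theorem 8.1 to those of Theorem 8.2 is a one-line computation. It is written out here as a sketch, assuming $W(x) = V(x)U^{-1}(x)$ as in the remark following (8.7) and the substitution $a^1 = U(x_1)a$; the display is added for the reader and does not appear in the paper.

```latex
% Assumptions: W = V U^{-1},  a^1 = U(x_1) a,  U(x_1) non-singular.
\tilde{a}\,\tilde{U}(x_1)\,V(x_1)\,a
  = \widetilde{\bigl(U(x_1)a\bigr)}\;V(x_1)U^{-1}(x_1)\;\bigl(U(x_1)a\bigr)
  = \tilde{a}^{1}\,W(x_1)\,a^{1} .
```

The same computation at $x_2$ with $a^2 = U(x_2)b$ gives $\tilde b\,\tilde U(x_2)V(x_2)b = \tilde a^2 W(x_2)a^2$. Since $U(x_1)$ and $U(x_2)$ are non-singular, the sets $(\xi^1, a)$ and $(\xi^1, a^1)$ are null together, and likewise $(\xi^2, b)$ and $(\xi^2, a^2)$.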


9. An auxiliary lemma. Preliminary to the proof of Theorem 8.1 we
shall establish the following result, which holds for the general problem $B_2$
without the restriction of separated end-conditions.


Lemma 9.1. If $B_2$ is non-singular, and $J_2(\xi, \eta) > 0$ for arbitrary
non-null admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), then there is a constant
$c \ge 0$ such that $J_2(\xi, \eta; c) > 0$ for arbitrary non-null admissible
$(\xi, \eta(x))$ satisfying (8.2).


This lemma enables one to discard for certain subsequent considerations
the end-conditions (8.3). In particular, the hypotheses of the lemma imply
that there is no solution $\eta$, $\zeta$ of (8.6) with $\eta(x_1) = 0 = \eta(x_2)$, while
$\eta(x) \not\equiv 0$ on $x_1 \le x \le x_2$. If the order of abnormality of (8.6) on


linearly independent solutions of (8.6) such that =0, Sa; 
s=1,---+,7), then the 2n X (2n—r,) matrix 
(9. 1) ,2n) 
nit 


has rank 2n—ro. In particular, the hypotheses of the lemma imply that for 
all non-null sets of constants (&p», z+) such that 


(9. 2) Ep, ni = nit (p= 1,° 


satisfies |! we have J(é,7) >0. Hence by the 
theorem on quadratic forms of Reid [14] there is a constant c= 0 such that 
>0 for all (é,y(x)) of the form (9.2) with non-null (ép, zz). 
Finally, under the hypotheses of the lemma, if (é,(a)) is admissible and 
(8.2) holds, it is well-known (see, for example, Bliss [2], p. 233) that there 
is a unique set of constants z; such that (a) = yit(v)2t satisfies (21) 
= 7i(21), =i (@2). Since n*i(x), = ie(x) 2 is a solution 
of (8.6) it then follows that 


J2(é 936) =J2(E n*3¢) + 3c) = J2(E, * 50), 


and the equality holds only if = *(x) on S S22. However, y*(x)) 
is of the form (9.2) and satisfies (8.2), so that J2(é, n*;c) = 0, and 
the equality holds only if (£,y*(«)) is null. Consequently, the result of the 
lemma holds for c determined as above. 
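The theorem on pairs of quadratic forms used in this proof is cited from Reid [14] and is not restated in this section. The two-variable example below, which is purely illustrative and not taken from the paper, indicates the kind of statement being assumed: a form that is positive on the non-zero solutions of a linear condition becomes positive everywhere once a sufficiently large multiple of the squared linear form is added.

```latex
% Illustrative example only; Q and L are hypothetical and do not occur in the paper.
Q(z) = z_1^2 - z_2^2 , \qquad L(z) = z_2 :
\qquad Q(z) = z_1^2 > 0 \ \text{ whenever } z \neq 0,\ L(z) = 0 ,
\\
Q(z) + c\,L(z)^2 = z_1^2 + (c - 1)\,z_2^2 > 0 \ \text{ for all } z \neq 0 \ \text{ as soon as } c > 1 .
```

In the lemma the role of $Q$ is played by $J_2$ restricted to the finite-dimensional family (9.2), and that of $L$ by the end-conditions (8.3).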


10. Proof of Theorem 8.1. Let c be a constant such that the result 
of Lemma 9.1 holds for 


®2 
+ 2w (a, UB ) dx. 




Then either the set £* is non-existent, or the quadratic form 2y!(é',0;¢) in 
&'= (p’ -, p’), is positive definite; in particular, in this latter 
case for arbitrary values of »(21) = (7i(x1)) the system of linear equations 


(10. 2) yer 3 ¢) = 0, (p’ =1,- 


determines unique values of ' = (ép-). Correspondingly, either the set £ is 
non-existent, or the quadratic form 2y?(é,0;c) in & = (&"), =p’ +1, 
+ +,p), 1s positive definite and for arbitrary 7(22) the system 


(10. 3) n(@2) = 0, (p’ =p’+1,---,p), 


determines unique values of $\xi^2 = (\xi_{\rho''})$. In the following we shall suppose
that both the sets $\xi^1$ and $\xi^2$ are existent; the simplifications which occur when
either of these sets is non-existent are obvious.

Corresponding to equations (86.4) of Bliss [2], let the columns of 
U*(x) = || |, = || Ves (x) || be solutions & of (8.6) such 
that their end-values + x, form with corresponding constants & == é&p»; a set 
of n linearly indepencent solutions of the algebraic equations 


(10.4) =0, —& (a1) + 5c) =0. 


It follows immediately that is non-singular, and that = U',;(2), 
(J =1,- are muiually conjugate solutions. of (8.6). 
Also U1(a) is non-singular on 7, = x = 2», since otherwise there would exist 
a solution »*(x), €*(x) of (8.6) with »*(2,) +40, which satisfies (10. 4) 
with suitable and is such that =0, where 7, 2,. Then 
n(x) (2), (US n(x) =0 on F=—0, 
would be a non-null admissible set (&,7(x)) satisfying (8.2), and for 
which J,(é,7;c) = 0, contrary to Lemma 9.1. Hence is non-singular 
ont; S2S2,. Similarly, if the columns of U?(x), V?(x) are solutions of 
(8.6) whose end-values at 2, form with corresponding & = ép»; a set of n 
linearly independent solutions of the algebraic system 


(10. 5) (&, n (22) == (), (2) + = 0, 


then these solutions of (8.6) are mutually conjugate, and $U^2(x)$ is
non-singular on $x_1 \le x \le x_2$. For brevity, we introduce the notation $\Xi^1$ for the
$k' \times n$ matrix $\|\xi_{\rho' j}\|$, and $\Xi^2$ for the $(k - k') \times n$ matrix $\|\xi_{\rho'' j}\|$.

If $x_1 < x_3 < x_2$ and $a = (a_i)$, $b = (b_i)$ are such that


(10.6)    $U^1(x_3)a = U^2(x_3)b$,
then the set $(\xi, \eta(x))$ defined as


& 


is admissible and satisfies (8.2). Moreover, corresponding to the proof of
Theorem 86.1 of Bliss [2], for this $(\xi, \eta(x))$ we have that $J_2(\xi, \eta; c)$ is
equal to


(10. 7) V*(23)U?(x3) — U* (23) V2(as) 


and hence (10.7) is positive for all non-null sets $(a, b)$ satisfying (10.6).
Since $U^1(x)$ and $U^2(x)$ are non-singular on $x_1 \le x \le x_2$ the matrix $\tilde V^1(x)U^2(x)
- \tilde U^1(x)V^2(x)$, which is necessarily a constant matrix, is also non-singular on
this interval. In particular, for a suitable choice of $U^2(x)$, $V^2(x)$ this
matrix reduces to the identity matrix, and we shall suppose that such a choice
is made; of course, such a choice of $U^2(x)$, $V^2(x)$ entails a corresponding
choice of $\Xi^2$. With this choice we now define $U(x) = U^1(x) + U^2(x)$,
$V(x) = V^1(x) + V^2(x)$. As in Bliss [2; p. 248], it follows readily that
the columns of $U(x)$, $V(x)$ are mutually conjugate solutions of (8.6), and
that $U(x)$ is non-singular on $x_1 \le x \le x_2$. If the constant matrix $K$ is
defined by U'(z,)K =U(a,), and we set then for arbitrary 
a=(a;) we have y'p-(Z'*a, U(2,)a;c) =0, (p’ =—1,---,k’), and hence 
for arbitrary ({,a) = ai), 


U(a:)a;c) (Ea, U(21)a; ¢) + — Bia, 0; ¢) 


= y'(E'*a, U(21)a;¢), 


and the equality sign holds if and only if é'—'*a. Moreover, by the 
argument of Bliss [2; pp. 248, 249], 


(10. 8) 2y1(=*a, U(2,)a;¢) —a0 V(a,)a= aa. 


Indeed, in view of the non-singularity of $U^1(x_1)$, the argument of Bliss may
be replaced by a simple direct calculation to show that the left-hand member
of (10.8) has the value da + da*, where a* is defined by U*(a,)a* = U*(a,)a, 
and hence da* => 0. Consequently, (8.9) holds if (é',a) is a non-null set. 




In a similar fashion it follows that (8.10) holds for a non-null set $(\xi^2, b)$.

Conversely, suppose that the columns of $U(x)$, $V(x)$ are mutually
conjugate solutions of (8.6) with $U(x)$ non-singular on $x_1 \le x \le x_2$ and
satisfying (8.9) and (8.10). Let $\mu_\alpha = \mu_{\alpha j}(x)$ be the corresponding
multipliers for these solutions $\eta_i = U_{ij}(x)$, $\zeta_i = V_{ij}(x)$ of (8.6). If $\eta(x)$ is
admissible,


(10.9)    $\eta(x) = U(x)h(x)$,


defines an admissible $h(x) = h(x; \eta) = (h_i(x))$. The so-called Clebsch
transformation is then equivalent to the identity


(10.10)    $2\Omega(x, \eta, \eta', \mu) = \tilde u R(x) u + (\tilde h \tilde U V h)'$,
where
(10.11)    $\mu_\alpha = \mu_{\alpha j}(x)h_j(x)$, $u = U(x)h'(x)$.


In particular, if (,»(2)) is an admissible set satisfying (8.2), (8.3), 


(10.12) Jo(é, 4) = 20° h(a) 5 + 21°(&, ; 


v2 
dx. 
71 


Since (8.2) holds we have $\varphi(x)u(x) = 0$, and from III′ it follows that the
integral in (10.12) is non-negative, and equal to zero if and only if $u(x) \equiv 0$;
that is, if and only if $h'(x) \equiv 0$. Moreover, by (8.9) and (8.10), the first
two terms of the right member of (10.12) are non-negative, and are
equal to zero if and only if $(\xi^1, h(x_1)) = (0, 0)$ and $(\xi^2, h(x_2)) = (0, 0)$,
respectively. That is, $J_2(\xi, \eta) \ge 0$ for arbitrary admissible sets $(\xi, \eta(x))$
satisfying (8.2), (8.3), and the equality sign holds only if $(\xi, \eta(x))$ is null.


11. Results related to Theorems 8.1 and 8.2. If $B_2$ involves separated
end-conditions, satisfies the non-singularity condition, and $J_2(\xi, \eta) \ge 0$ for
arbitrary admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), then if $(\xi, \eta(x))$ is
such a set for which $J_2(\xi, \eta) = 0$ it is well-known that there exists a $\zeta(x)$
such that $\eta$, $\zeta$ is a solution of (8.6); moreover, there are constants $d_{\mu'}$, $d_{\mu''}$,
with which $\xi$, $\eta(x_1)$, $\zeta(x_1)$, $\eta(x_2)$ and
$\zeta(x_2)$ satisfy the “transversality conditions” 6


6 See, for example, Bliss [2; p. 232]; indeed, the transversality conditions (11.1)
will involve only $p - r_0$ parameters $d_{\mu'}$, $d_{\mu''}$, if the problem $B_2$ is abnormal of order $r_0$.




yer n(X:)) + dy Vy, =) = ya (é, n(2:)) + dy 
2 Le dy Gyr, 
(11.1) yer (é p 
) + dy Vy + & (Z2), 


On the other hand, if $B_2$ has separated end-conditions, satisfies III′, and
$J_2(\xi, \eta) > 0$ for arbitrary non-null admissible $(\xi, \eta(x))$ satisfying (8.2),
(8.3), the definiteness of $J_2(\xi, \eta)$ may be characterized more precisely. If
$U(x)$, $V(x)$ and $c$ are determined as in Theorem 8.1, the definiteness of the
quadratic forms (8.9), (8.10) implies that there is a constant $\kappa' > 0$ such
that


(11. 2) 9/49 , 2 2 


Now, as already pointed out in Section 3, in view of III′ there is a constant
$l \ge 0$ such that for $2\omega(x, \eta, \eta'; l) = 2\omega(x, \eta, \eta') + l\,\|\Phi(x, \eta, \eta')\|^2$ the
corresponding matrix $R(x; l) = R(x) + l\,\tilde\varphi(x)\varphi(x)$ is positive definite on
$x_1 \le x \le x_2$. Moreover, the replacement of $\omega(x, \eta, \eta')$ by $\omega(x, \eta, \eta'; l)$ leaves
unaltered the coefficients of (8.6); in particular, the solutions $\eta_i = U_{ij}(x)$,
$\zeta_i = V_{ij}(x)$ of these equations determined in Theorem 8.1, and the
corresponding multipliers $\mu_\alpha = \mu_{\alpha j}(x)$, are unaltered by this substitution. If $u$
is defined as in (10.11), then there is a $\kappa'' > 0$ such that $\tilde u R(x; l) u
\ge \kappa''\,\|h'(x)\|^2$ on $x_1 \le x \le x_2$. Now by elementary inequalities,


<2) h(x) f rae, 
and hence 


Finally, there exist positive constants x2, xz, dependent only upon U(x), U’(2), 
(z,Sx2S2.). Considering (10.10), (10.12) for the corresponding 
Q(z, 7, 7’,23;1), a simple combination of the above inequalities results in the 
existence of a positive constant « such that for arbitrary admissible (é, (2) ), 


(11.3) ¢) + 2y?(&, n(@2) 




where 


+ f Unk? + 


That is, we have established the following result. 


THEOREM 11.1. If $B_2$ involves separated end-conditions, satisfies III′
and $J_2(\xi, \eta) \ge 0$ for arbitrary admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3),
then either there is a non-null admissible $(\xi, \eta)$ satisfying (8.2), (8.3), for
which $J_2(\xi, \eta) = 0$, and which with associated $\zeta(x)$, $d_{\mu'}$, $d_{\mu''}$ is a solution of
the differential equations (8.6) and the transversality conditions (11.1), or
there exist constants $c \ge 0$, $l \ge 0$, $\kappa > 0$ such that (11.3) holds for arbitrary
admissible $(\xi, \eta(x))$; in particular, in this latter case,


(11. 5) I(& 0) = 
for arbitrary admissible (&,n(x)) satisfying (8.2), (8.3). 


If $B_2$ involves separated end-conditions, satisfies III′, and $J_2(\xi, \eta) > 0$
for arbitrary non-null admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), then for
$U(x)$, $V(x)$, $c$ determined as in Theorem 8.1 there is a $\delta_0 > 0$ such that
$U(x)$ remains non-singular on $x_1 - \delta_0 \le x \le x_2 + \delta_0$, while if $|X_1 - x_1| < \delta_0$,
| X2.—a.| <8 the inequalities 2M(#,a;¢;X,) 
2), corresponding to (11.2), hold for 
a suitable positive $\kappa_1$. Now if $\eta_i(x)$, $(i = 1, \dots, n)$, are arbitrary absolutely
continuous functions on $X_1 \le x \le X_2$ the corresponding functions $h_i(x)$
defined by (10.9) are also absolutely continuous on this interval. If
Pa are defined by (10.11), and = 
clearly the integral 


Xo 


exists. Indeed (11.6) is the Hilbert invariant integral for the problem $B_2$
in the field determined by the mutually conjugate solutions $\eta_i = U_{ij}(x)$,
$\zeta_i = V_{ij}(x)$ of (8.6). Moreover, since $\eta'_i = \pi_i + u_i$, where $u$ is defined by
(10.11), it is a consequence of the identity (10.10) that


Consequently, we have the following result. 


Xe 




THEOREM 11.2. If $B_2$ involves separated end-conditions, satisfies III′,
and $J_2(\xi, \eta) > 0$ for arbitrary non-null admissible $(\xi, \eta(x))$ satisfying (8.2),
(8.3), then there exist constants $\kappa_1 > 0$, $\delta_0 > 0$, $c \ge 0$ such that the $U(x)$
of Theorem 8.1 remains non-singular on $x_1 - \delta_0 \le x \le x_2 + \delta_0$, and if the
components of $\eta(x)$ are absolutely continuous on $X_1 \le x \le X_2$, where
$|X_1 - x_1| < \delta_0$, $|X_2 - x_2| < \delta_0$, then


(11.7) 5c) + 50) + Xe) 
with $h(x)$ defined by (10.9) and $J^*_2(\eta; X_1, X_2)$ by (11.6).


It is to be remarked that the coefficient of $\mu_\alpha$ in the integrand of
$J^*_2(\eta; X_1, X_2)$ is equal to $\Phi_\alpha(x, \eta, \pi) + \varphi_{\alpha i}(x)(\eta'_i - \pi_i) = \Phi_\alpha(x, \eta, \eta')$, and
hence this coefficient is zero for any particular $\alpha$ such that $\Phi_\alpha(x, \eta, \eta') = 0$.
This fact will be utilized in the following section.


12. A problem involving isoperimetric conditions. For a problem $B_2$
not involving separated end-conditions, results analogous to those of Theorems
11.1 and 11.2 may be obtained by reducing such a problem, by a well-known
type of transformation (see, for example, Bliss [2], Secs. 69, 88), to one
involving separated end-conditions. For brevity, we consider immediately a
more general problem involving isoperimetric conditions. The notation $B_{2I}$
will be assigned to the variational problem involving the quadratic
functional (8.1) subject to (8.2), (8.3), and the isoperimetric conditions


(12.1) X(é 9321, 2) = xs (&9(%), (22) ) 
+ 6* dx = 0, (s= 


where the $\chi_s$ are linear forms in $\xi_\rho$, $\eta_i(x_1)$, $\eta_i(x_2)$ with real coefficients, and
the functions $\varphi^*_{si}(x)$, $\theta^*_{si}(x)$ are real-valued continuous functions on an
interval containing $x_1 \le x \le x_2$ in its interior.


The Euler-Lagrange equations for Bz; are 


(12. 2) (d/dz) (z, “f p) Csh*si} {Qn, (z, + si} = 0, 
©, (2, 0, 


where the constants $e_s$, $(s = 1, \dots, q)$, are “isoperimetric parameters.”
The corresponding transversality conditions are




ye(€,9(21), 9(X2)) + + = 0, 
via (41), 9 (22) ) + 41 + esxs, 41 
(12.3) — [Qn,(2, 1, + = 0, 
yi2(E, 9(@2)) + + 
+ (xm, + esh* si = 0. 


If functions 7°; (x), (= 1,- + -,q), are defined as 


the problem B,.; is equivalent to the problem B, in (é 2) = (&p, Nc) 
= (&, ni Involving the quadratic functional 


subject to the first order linear homogeneous differential equations 
(12.6) a(x, », 7’) =0, 
(12.7) =0, + + = 0, 
and the linear homogeneous end-conditions 
(12.8) = 0, 
(12.9) = 0, (42) = 0, 
xs (E, 7(41), 9° (41) ) + = 0. 


The problem B, clearly involves separated end-conditions; moreover, for 
B. each of the conditions of non-singularity, III, or III’, is equivalent to 
the corresponding condition for the initial problem B., or for the isoperimetric 
problem B.;. For a non-singular problem the canonical form of the Euler- 
Lagrange equations for B, will be denoted by 


(12. 10) =C(x)n—A(z)%. 
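
The reduction of an isoperimetric condition to a differential equation with end-conditions can be illustrated in the simplest scalar case. The following sketch is not taken from the text: the function $\theta(x)$, the auxiliary variable $z(x)$, and the multipliers $\mu(x)$, $e$ are hypothetical names introduced only for this illustration, and fixed end values are assumed.

```latex
% Illustrative scalar problem (an assumption, not the problem treated in the text):
%   minimize  J[\eta] = \int_{x_1}^{x_2} \eta'^2 \, dx
%   subject to \eta(x_1) = \eta(x_2) = 0 and \int_{x_1}^{x_2} \theta(x)\,\eta(x)\,dx = 0.
\begin{align*}
  z(x) &= \int_{x_1}^{x} \theta(t)\,\eta(t)\,dt
  && \text{(auxiliary variable)} \\
  z'(x) - \theta(x)\,\eta(x) &= 0
  && \text{(differential equation replacing the integral condition)} \\
  z(x_1) = 0, \quad z(x_2) &= 0
  && \text{(separated end-conditions)} \\
  F &= \eta'^2 + \mu(x)\,\bigl(z' - \theta\,\eta\bigr)
  && \text{(augmented integrand)} \\
  \frac{d}{dx} F_{z'} - F_z = \mu'(x) &= 0
  && \Longrightarrow \ \mu(x) \equiv e \ \text{(a constant)} \\
  \frac{d}{dx} F_{\eta'} - F_\eta = 2\,\eta'' + e\,\theta &= 0 .
\end{align*}
```

In this sketch the multiplier attached to the new differential equation is necessarily constant, which is how an isoperimetric parameter reappears after the transformation.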


If (8.1) is positive for arbitrary non-null admissible (, (x) ) satisfying 
(8.2), (8.3), (12.1), then (12.5) is positive for arbitrary non-null ad- 
missible (é, (x) ) = (x), satisfying (12. 6), (12. 7), (12. 8), 
(12.9). In this case we shall denote by Ne—Uao,(x), S¢o—Vor(z). 
(r=1;---,m=—2n-+p), a conjugate system of solutions of (12.10) with
U(x) Uc;(x)|| non-singular on and satisfying for B, the 
inequalities (8.9), (8.10) of Theorem 8.1. Now suppose that this condi- 
tion is satisfied, that is admissible, and define by 
(12.4); then the differential equations (12.7) are satisfied, as well as the 
first 2n conditions of (12.9). For such an


(12.11) a(x) = (ni (2), ni (22), + dx) 


the equation 
(12. 12) n(x) =U(z)h(z), (q% 


corresponding to (10.9), defines h(x) = (ho(x)), (o=1,---,m), with 
admissible components. For the solutions =Vor(z) of 
(12.10) there are corresponding multipliers which 
with Uc,;(x) satisfy the ordinary form of the Euler-Lagrange equations for 
B.; in particular, =Vnyi,r(@) and and are 
constants. In terms of the Euler equations for B.;, we have that y; = U;,(z), 
Pa = War(x), =Vonse,r are solutions of (12.2). For an admissible 
we shall denote by the values where h(r) is 
defined by (12.11), (12.12). In view of the auxiliary differential equations 
and end-conditions for B, satisfied by an 4 of the form (12.11), and since 
| |] S | a(x) |], || S |] ||, the following result for B.; is an 
immediate consequence of Theorem 11.1 for Bs. 


THEOREM 12.1. If $B_{2I}$ satisfies III’, and $J_2(\xi, \eta) \ge 0$ for arbitrary
admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), (12.1), then either there is a
non-null admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), (12.1) for which
$J_2(\xi, \eta) = 0$, and which satisfies the differential equations (12.2) and
transversality conditions (12.3) with associated $\mu_\alpha(x)$, $e_s$, $d_\nu$, or there exist
constants $c \ge 0$, $l \ge 0$, $\kappa > 0$ such that for arbitrary admissible $(\xi, \eta(x))$,


2y(E, (21), + |? + X(E 95 22) [l?) 


+ f 20(2, > w(x; 7) 1) dx = «I (é, 


(12. 13) 

where 

(12.14) 1(& 9) = + + 


In particular, in this latter case, $J_2(\xi, \eta) \ge \kappa I(\xi, \eta)$ for arbitrary admissible
$(\xi, \eta(x))$ satisfying (8.2), (8.3), (12.1).




In view of the comment after Theorem 11.2 we have for Bz; the 
following result as a consequence of Theorem 11.2 for B.. 


THEOREM 12.2. If $B_{2I}$ satisfies III’, and $J_2(\xi, \eta) > 0$ for arbitrary non-null
admissible $(\xi, \eta(x))$ satisfying (8.2), (8.3), (12.1), then there exist
constants 8) > 0, = > 0 such that the U(x) = || of Theorem 
8.1 for B, remains non-singular on 2, —8 8, and if the com- 
ponents of n(x) are absolutely continuous on X, SxS X2, where | X,—a, | 
< | X2— a2 | < then 


(12.15) (Xi), 9(X2)) + X2) 
= € |? + | + | 2(X2) |?) 
— 0(X1), + X (E03 Xe) *), 


in which h(x) = (h-(x)), (r=1,: is defined by 
Ui, (x)h,(z) = = 91 (X2), 


(12. 16) | 

with == (234) (2), Yo = 9) = (x), where 
$W_{\alpha r}(x)$ are corresponding multipliers for the $U_{ir}(x)$, and


X2 
(12. 17) X») = 2 f (20. *, + )Qx,(z, } da. 


13. Concluding remarks. Theorem 12.1 is of use in the consideration 
of self-adjoint boundary problems. For example, in the boundary problem 
considered by the author [11] this theorem enables one to establish the 
existence of characteristic values and corresponding characteristic solutions 
possessing definitive extremizing properties. Such a proof is simpler in detail 
than that employed in [11], which utilized the Green’s matrix and generalized 
Green’s matrix for differential systems. 
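
A classical special case may help fix ideas about the kind of extremizing property meant here. The example below is not from the paper; it assumes the simplest functional $\int_0^\pi \eta'^2\,dx$ with fixed end values and a single isoperimetric condition, and it uses only the standard expansion of $\eta$ in the characteristic functions $\sin kx$.

```latex
% Assumed setting: \eta absolutely continuous on [0,\pi] with \eta' square-integrable,
% \eta(0) = \eta(\pi) = 0, and sine expansion \eta(x) = \sum_{k \ge 1} a_k \sin kx.
\begin{align*}
  \int_0^\pi \eta'^2\,dx &= \frac{\pi}{2} \sum_{k \ge 1} k^2 a_k^2 ,
  \qquad
  \int_0^\pi \eta^2\,dx = \frac{\pi}{2} \sum_{k \ge 1} a_k^2 , \\
  \int_0^\pi \eta'^2\,dx &\ge 1 \cdot \int_0^\pi \eta^2\,dx
  && \text{(equality for } \eta = \sin x\text{)} , \\
  \int_0^\pi \eta(x)\,\sin x\,dx = 0 \ \Longrightarrow\ a_1 = 0
  \ &\Longrightarrow\ \int_0^\pi \eta'^2\,dx \ge 4 \int_0^\pi \eta^2\,dx
  && \text{(equality for } \eta = \sin 2x\text{)} .
\end{align*}
```

Here the constants 1 and 4 are the first two characteristic values of $\eta'' + \lambda\eta = 0$, $\eta(0) = \eta(\pi) = 0$, each obtained as a minimum under one more isoperimetric condition.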

For ordinary problems of Bolza involving separated end-conditions 
Theorem 11.1 has the effect of allowing one to disregard in a certain sense 
the auxiliary differential equations and end-conditions. More precisely, 
suppose H:y:(x), A—=1, A(X), is an extremal for such a 
problem of Bolza involving (2.1) subject to (2.3) and (2.4), and satisfying 
with constants e, the multiplier rule and conditions III’, IV’. If g is replaced 
by g+ and f by f+ + where c, J and 
Pa(%37) are as in Theorem 11.1, then the resulting minimum problem has
separated end-conditions, is equivalent to the original problem, while for this 
new problem / is an extremal satisfying the multiplier rule with the same e,, 
and for which the second variation is positive for arbitrary non-null admissible 
sets. The proof of this result here given is of no value in reducing the 
sufficiency proof for such a problem of Bolza to a simpler problem involving 
no auxiliary differential equations and end-conditions, however, since the 
fundamental properties of the accessory problem for the general Bolza problem 
have been used in its derivation. One could prove this result directly, however,
in a manner similar to that by which Hestenes has proved Lemma 3.2
of [8] for his formulation of the general non-parametric problem of Bolza.
Indeed, for such a problem Hestenes showed that by suitable modification
of the integrand function one could obtain the sufficiency theorem for the
given problem from the sufficiency theorem for the modified problem involving
no side differential equations. For an ordinary problem of Bolza not involving
separated end-conditions, or for an isoperimetric problem of Bolza, the result
of Theorem 12.1 does not lead in a corresponding fashion to a problem of 
the original type for which the effect of the auxiliary differential equations 
and end-conditions is eliminated for the second variation. However, this 
property is enjoyed by an equivalent formulation of the general problem of 
Bolza, which may be described briefly as a hybrid of that employed by 
Hestenes and the one here used. 


NORTHWESTERN UNIVERSITY. 


REFERENCES. 


1. G. A. Bliss, “The transformation of Clebsch in the calculus of variations,”
Proceedings of the International Mathematical Congress held in Toronto, vol. 1 (1924),
pp. 589-603.

2. ———, Lectures on the calculus of variations, University of Chicago Press,
1946.

3. M. R. Hestenes, “Sufficient conditions for the problem of Bolza in the calculus
of variations,” Transactions of the American Mathematical Society, vol. 36 (1934), pp.
793-818.

4. M. R. Hestenes and W. T. Reid, “A note on the Weierstrass condition in the
calculus of variations,” Bulletin of the American Mathematical Society, vol. 45 (1939),
pp. 471-473.

5. M. R. Hestenes, “Generalized problem of Bolza in the calculus of variations,”
Duke Mathematical Journal, vol. 5 (1939), pp. 309-324.

6. ———, “The problem of Bolza in the calculus of variations,” Bulletin of the
American Mathematical Society, vol. 48 (1942), pp. 57-75.

7. ———, “Sufficient conditions for the isoperimetric problem of Bolza in the
calculus of variations,” Transactions of the American Mathematical Society, vol. 60
(1946), pp. 93-118.

8. ———, “An indirect sufficiency proof for the problem of Bolza in nonparametric
form,” Transactions of the American Mathematical Society, vol. 62 (1947), pp.
509-535.

9. J. Radon, “Über die Oszillationstheoreme der konjugierten Punkte beim Probleme
von Lagrange,” Münchener Sitzungsberichte, vol. 57 (1927), pp. 243-257.

10. ———, “Zum Problem von Lagrange,” Abhandlungen aus dem Mathematischen
Seminar Hamburg, vol. 6 (1928), pp. 273-299.

11. W. T. Reid, “A boundary value problem associated with the calculus of
variations,” American Journal of Mathematics, vol. 54 (1932), pp. 769-790.

12. ———, “Sufficient conditions by expansion methods for the problem of Bolza
in the calculus of variations,” Annals of Mathematics, vol. 38 (1937), pp. 662-678.

13. ———, “A direct expansion proof of sufficient conditions for the non-parametric
problem of Bolza,” Transactions of the American Mathematical Society, vol. 42
(1937), pp. 183-190.

14. ———, “A theorem on quadratic forms,” Bulletin of the American Mathematical
Society, vol. 44 (1938), pp. 437-440.

15. ———, “Isoperimetric problems of Bolza in non-parametric form,” Duke
Mathematical Journal, vol. 5 (1939), pp. 675-691.

16. ———, “A matrix differential equation of Riccati type,” American Journal
of Mathematics, vol. 68 (1946), pp. 237-246.

17. ———, “Addendum” to preceding paper, American Journal of Mathematics,
vol. 70 (1948), p. 460.




